Abstract

We present some techniques for planning in domains specified with the recent standard
language PDDL2.1, supporting "durative actions" and numerical quantities. These techniques
are implemented in LPG, a domain-independent planner that took part in the 3rd
International Planning Competition (IPC). LPG is an incremental, anytime system producing
multi-criteria quality plans. The core of the system is based on a stochastic local
search method and on a graph-based representation called "Temporal Action Graphs" (TA- 
graphs). This paper focuses on temporal planning, introducing TA-graphs and proposing 
some techniques to guide the search in LPG using this representation. The experimental 
results of the 3rd IPC, as well as further results presented in this paper, show that our 
techniques can be very effective. Often LPG outperforms all other fully-automated plan- 
ners of the 3rd IPC in terms of speed to derive a solution, or quality of the solutions that 
can be produced. 



1. Introduction 

Modeling temporal and numerical information in automated planning is important for rep- 
resenting real-world domains, where actions take time, and consume resources, and the 
quality of the solutions should take these aspects into account. In the '80s and early '90s 
some expressive, but inefficient, planning systems handling time were developed (e.g., Vere, 
1983; Tsang, 1986; Allen, 1991; Penberthy & Weld, 1994). More recently, a number of 
alternative interesting approaches to temporal planning have been proposed (e.g., Smith &
Weld, 1999; Do & Kambhampati, 2001; Haslum & Geffner, 2001; Dimopoulos & Gerevini,
2002). Some of these planners can compute plans with optimal makespan, but in practice 
most of them scale up poorly. 

Local search is emerging as a powerful method to address fully-automated planning, 
though in principle this approach does not guarantee generation of optimal plans. In par- 
ticular, two planners that successfully participated in the recent 3rd International Planning 
Competition (IPC) are based on local search: FF (Hoffmann & Nebel, 2001) and LPG.

In earlier work on LPG (Gerevini & Serina, 1999, 2002) we proposed a first version of our
system using several techniques for local search in the space of action graphs (A-graphs), 
particular subgraphs of the planning graph representation (Blum & Furst, 1997). This
version of the planner handled only STRIPS domains, possibly extended with simple costs 
associated with the actions. In this paper, which is a revised and extended version of a recent 
work (Gerevini, Serina, Saetti, & Spinoni, 2003), we present some major improvements
that were used in the 3rd IPC to handle domains specified in the recent PDDL2.1 language
supporting "durative actions" and numerical quantities (Fox & Long, 2003).

The general search scheme of our planner is Walkplan, a stochastic local search procedure
similar to the well-known Walksat (Selman, Kautz, & Cohen, 1994). Two of the most
important extensions on which we focus in this paper concern the use of temporal action 
graphs (TA-graphs), instead of simple A-graphs, and some new techniques to guide the local 
search process. In a TA-graph, action nodes are marked with temporal values estimating 
the earliest time when the corresponding action terminates, while fact nodes are marked 
with temporal values estimating the earliest time when the corresponding fact becomes 
true. A set of ordering constraints is maintained during search to handle mutually exclusive 
actions, and to represent the temporal constraints implicit in the "causal" relations between 
actions in the current plan. 

The new heuristics exploit some reachability information to weigh the elements (TA- 
graphs) in the search neighborhood that resolve an inconsistency selected from the current 
TA-graph. The evaluation of these TA-graphs is based on the estimated number of search 
steps required to reach a solution (a valid plan), its estimated makespan, and its estimated 
execution cost. LPG is an incremental planner, in the sense that it produces a sequence
of valid plans each of which improves the quality of the previous ones. Plan quality is 
modeled by execution and temporal costs in a flexible way (the user can determine the 
relative importance of the plan quality criteria). 

In the 3rd IPC, our planner demonstrated excellent performance on a large set of test 
problems in terms of both speed to compute the first solution and quality of the best solution 
computed by the incremental process. LPG was the fully-automated planner that solved the
greatest number of problems, and the one with the highest ratio of solved problems to
attempted problems.

The paper is organized as follows. Section 2 presents the action and plan representation 
used in the competition version of LPG. Section 3 describes LPG's local search neighborhood,
some new heuristics for temporal action graphs, and the techniques for computing the 
reachability and temporal information used in these heuristics. Moreover, in this section 
we describe how LPG handles numerical variables and the incremental process to produce 
good quality plans. Section 4 presents the results of an experimental analysis using the 
test problems of the 3rd IPC, and illustrating the efficiency of our approach especially for 
temporal planning. Section 5 gives conclusions, and mentions current and future work. 
Finally, a collection of appendices describes LPG's algorithm for computing the mutual
exclusion relations used during search, and gives details about some of the experimental 
results presented in Section 4. 

2. Action and Plan Representation 

In this section we introduce our graph-based representations for STRIPS and temporal plans, 
which can be seen as an elaboration of planning graphs (Blum & Furst, 1997).

2.1 Planning Graphs and Action Graphs

A planning graph is a directed acyclic levelled graph with two kinds of nodes and three kinds 
of edges. The levels alternate between a fact level, containing fact nodes, and an action 
level containing action nodes. An action node at a level t represents an action (instantiated
operator) that can be planned at time step t. A fact node at a level t represents a proposition 
corresponding to a precondition of one or more actions at time step t, or to an effect of one 
or more actions at time step t - 1. The fact nodes of level 1 represent the positive facts
of the initial state of the planning problem (every fact that is not mentioned in the initial 
state is considered false). 

In the following, we indicate with [u] the proposition (action) represented by the fact 
node (action node) u. The edges in a planning graph connect action nodes and fact nodes. 
In particular, an action node a at a level i is connected by: precondition edges from the 
fact nodes of level i representing the preconditions of [a]; add-edges to the fact nodes of
level i + 1 representing the positive effects of [a]; delete-edges to the fact nodes of level i + 1
representing the negative effects of [a]. Each fact node f at a level l is associated with a
no-op action node at the same level, which represents a dummy action having [f] as its only
precondition and effect. 

Two action nodes a and b are marked as mutually exclusive in the graph when one of 
the actions deletes a precondition of the other (interference) or an add-effect of the other 
(inconsistent effects), or when a precondition node of a and a precondition node of b are 
marked as mutually exclusive (competing needs). 

Two proposition nodes p and q in a proposition level are marked as exclusive if all ways 
of making proposition [p] true are exclusive with all ways of making [q] true (each action 
node a having an add-edge to p is marked as exclusive with each action node b having an 
add-edge to q). When two fact or action nodes are marked as mutually exclusive, we say 
that there is a mutex relation (or simply a mutex) between them. 
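
The mutex conditions above can be illustrated with a minimal Python sketch. The `Action` record, the function names, and the toy actions are hypothetical and chosen only for illustration; this is not LPG's implementation, which derives persistent mutex relations with the dedicated algorithm of Appendix A.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical STRIPS-style action record used only in this sketch."""
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset


def actions_mutex(a, b, fact_mutexes=frozenset()):
    """True if a and b are mutually exclusive at the same level."""
    # Interference: one action deletes a precondition of the other.
    if a.delete & b.pre or b.delete & a.pre:
        return True
    # Inconsistent effects: one action deletes an add-effect of the other.
    if a.delete & b.add or b.delete & a.add:
        return True
    # Competing needs: a precondition of a is mutex with a precondition of b.
    return any(frozenset((p, q)) in fact_mutexes for p in a.pre for q in b.pre)


def facts_mutex(p, q, producers, action_mutexes):
    """True if every action adding p is mutex with every action adding q."""
    return all(frozenset((a, b)) in action_mutexes
               for a in producers[p] for b in producers[q])


move = Action("move", frozenset({"at-A"}), frozenset({"at-B"}), frozenset({"at-A"}))
load = Action("load", frozenset({"at-A", "free"}), frozenset({"holding"}),
              frozenset({"free"}))
noop = Action("noop-at-A", frozenset({"at-A"}), frozenset({"at-A"}), frozenset())

move_load = actions_mutex(move, load)   # move deletes at-A, which load needs
load_noop = actions_mutex(load, noop)   # no interference between these two
fact_pair = facts_mutex("at-B", "holding",
                        {"at-B": [move], "holding": [load]},
                        {frozenset((move, load))})
```

In this toy example `move` interferes with `load`, so the facts they alone produce (`at-B` and `holding`) are also marked as exclusive.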

Given a planning problem Π, the corresponding planning graph G can be incrementally
constructed level by level starting from level 1 using a polynomial algorithm (Blum & Furst,
1997). The graph construction should reach a propositional level where the goal nodes are
present, and there is no mutex relation between them (see footnote 1). The fixed-point level of the graph
is the level from which the nodes and mutex relations at every subsequent level remain the 
same. Blum and Furst refer to this level as the level where the graph has "leveled off". 
The mutex relations in the planning graph monotonically decrease with the increase of the 
levels: a mutex relation holding at a certain level may not hold at the next levels, but it is 
guaranteed that it holds at all previous levels containing the fact/action nodes involved in 
the relation. The mutex relations at the fixed-point level of the graph are called persistent 
mutex relations (Fox & Long, 2000), because they hold at every level of the graph.

Without loss of generality, we can assume that the goal nodes of the last level represent 
the preconditions of the special action [a_end], which is the last action in any valid plan, while
the fact nodes of the first level represent the effects of the special action [a_start], which is
the first action in any valid plan. 

Our approach to planning uses particular subgraphs of G, called action graphs, which
represent partial plans. 

Definition 1 An action graph (A-graph) for G is a subgraph A of G containing a_start
and a_end, and such that, if a is an action node of G in A, then also the fact nodes of G
corresponding to the preconditions and positive effects of [a] are in A, together with the
edges connecting them to a.

1. In some cases, when the problem is not solvable, the algorithm identifies that there is no level satisfying
these conditions, and hence it detects that the problem is unsolvable.

Notice that an action graph can represent an invalid plan for the problem under consid- 
eration, since it may contain some inconsistencies, i.e., an action with precondition nodes 
that are not supported, or a pair of action nodes involved in a mutex relation. In general, 
a precondition node q at a level i is supported in an action graph A of G if either (i) in A
there is an action node at level i - 1 representing an action with (positive) effect [q], or (ii)
i = 1 (i.e., [q] is a proposition of the initial state). An action graph without inconsistencies 
represents a valid plan and is called a solution graph. 

Definition 2 A solution graph for G is an action graph A_s of G such that all precondition
nodes of the actions in A_s are supported, and there is no mutex relation between action nodes
of A_s.
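
As an illustration of Definition 2, the following Python sketch lists the inconsistencies of a small levelled action graph: unsupported preconditions and mutex pairs at the same level. The `Node` record, the list-of-levels encoding, and the toy facts are assumptions made for this example (no-ops are written explicitly as nodes); an action graph is a solution graph exactly when the returned list is empty.

```python
from itertools import combinations
from typing import NamedTuple


class Node(NamedTuple):
    """Hypothetical action node (no-ops are represented explicitly)."""
    name: str
    pre: frozenset
    add: frozenset


def find_inconsistencies(levels, init_facts, mutex):
    """levels[i] holds the action nodes of level i + 1, in any order.

    mutex is a set of frozenset({name_a, name_b}) pairs.
    """
    found = []
    for i, nodes in enumerate(levels):
        if i == 0:
            # Facts of level 1 are those of the initial state.
            available = set(init_facts)
        else:
            # Otherwise a fact is supported by an action at the previous level.
            available = set().union(*(n.add for n in levels[i - 1]))
        for n in nodes:
            for q in sorted(n.pre):
                if q not in available:
                    found.append(("unsupported", i + 1, n.name, q))
        for a, b in combinations(nodes, 2):
            if frozenset((a.name, b.name)) in mutex:
                found.append(("mutex", i + 1, a.name, b.name))
    return found


levels = [
    [Node("pick", frozenset({"hand-empty"}), frozenset({"holding"})),
     Node("noop-door-closed", frozenset({"door-closed"}), frozenset({"door-closed"}))],
    [Node("drop", frozenset({"holding", "at-goal"}), frozenset({"delivered"}))],
]
issues = find_inconsistencies(levels, {"hand-empty", "door-closed"}, set())
```

Here `drop` needs `at-goal`, which no node at level 1 produces, so the graph is not yet a solution graph.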

For large planning problems the construction of the planning graph can be computa- 
tionally very expensive, especially because of the high number of mutex relations. For this 
reason our planner considers only pairs of actions that are persistently mutex, derived using 
a dedicated algorithm given in Appendix A. An experimental comparison with IPP's implementation
of the planning graph construction (Koehler, Nebel, Hoffmann, & Dimopoulos,
1997) showed that in practice our method for deriving mutex relations is considerably more 
efficient than the "traditional" method for deriving the mutex relations in the fixed-point 
level of the graph. Moreover, for the problems that we tested, our method derived all the 
persistent mutex relations found by the traditional method. 

The definition of action graphs and the notion of supported facts can be made stronger 
by observing that the effects of an action node can be automatically propagated to the next 
levels of the graph through the corresponding no-ops, until there is an interfering action 
blocking the propagation (if any), or the last level of the graph has been reached. The use 
of the no-op propagation, which we presented in previous work (Gerevini & Serina, 2002),
leads to a smaller search space and can be incorporated into the definition of action graph. 

Definition 3 An action graph with propagation is an action graph A such that if a is
an action node of A at a level l, then, for any positive effect [e] of [a] and any level l' > l
of A, the no-op of e at level l' is in A, unless there is another action node at a level l''
(l < l'' < l') which is mutex with the no-op.

Since in the rest of this paper we consider only action graphs with propagation, we will 
abbreviate their name simply to action graphs (leaving implicit that they include the no-op 
propagation).

In most of the existing planners based on planning graphs, when the search for a solution 
graph fails, G is iteratively expanded by adding an extra level and performing a new search
using the resulting graph. In systematic planners like GRAPHPLAN (Blum & Furst, 1997),
STAN (Fox & Long, 1998b) and IPP (Koehler et al., 1997) the search fails when there exists
no solution graph, while in planners that use local search like BLACKBOX (Kautz & Selman,
1999) or GPG (Gerevini & Serina, 1999) the search fails when a certain search limit is 
exceeded. As we will show, in LPG there is no need to explicitly treat this kind of search 
failure, since the size of the graph is incrementally increased during search (i.e., the graph 
extension can be part of a local search step). 
2.2 Linear and Temporal Action Graphs

The first version of LPG (Gerevini & Serina, 2002) was based on action graphs where each
level may contain an arbitrary number of action nodes, as in the usual definition of planning 
graph. The version of the system that participated in the 3rd IPC uses a restricted class of 
action graphs, called linear action graphs, combined with some additional data structures 
supporting a more expressive action and plan representation. In particular, the new system 
can handle actions having temporal durations and preconditions/effects involving numerical 
quantities, as specified in PDDL2.1 (Fox & Long, 2003). In this paper we focus mainly on
planning for temporal domains, where LPG showed particularly good performance with 
respect to the other (fully-automated) participants of the 3rd IPC. 

In order to keep the presentation simple, we describe our techniques considering mainly 
preconditions of type "over all" (i.e., preconditions that must hold during the whole action 
execution) and effects of type "at end" (i.e., effects that hold at the end of the action 
execution); see footnote 2. In Section 3.4 we discuss how we handle the other types of preconditions and
effects in the test domains of the 3rd IPC. 

Definition 4 A linear action graph (LA-graph) of G is an A-graph of G in which each
level of actions contains at most one action node representing a domain action and any 
number of no-ops. 

It is important to note that having only one action in each level of an LA-graph does 
not prevent the generation of parallel (partially ordered) plans. In fact, from any LA-graph 
we can easily extract a partially ordered plan where the ordering constraints are (1) those 
between mutex actions and (2) those implicit in the causal structure of the represented plan. 
Regarding the first constraints, if a and b are mutex and the level of a precedes the level of 
b, then [a] is ordered before [b]; regarding the second constraints, if a has an effect node that
is used (possibly through the no-ops) to support a precondition node of b, then [a] is ordered
before [b]. These causal relations between actions producing an effect and actions consuming
it are similar to the causal links in partial-order planning (e.g., McAllester & Rosenblitt,
1991; Penberthy & Weld, 1992; Nguyen & Kambhampati, 2001). LPG keeps track of these
relationships during search and uses them to derive some heuristic information useful for 
guiding the search (more details on this in the next section), as well as to extract parallel 
plans from the solution graph. 
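
The extraction of a partially ordered plan from an LA-graph can be sketched as follows. The encoding (a list of actions in level order), the `Action` record, and the rule that a precondition is supported by the nearest earlier action adding it whose no-op chain is not blocked by a deleting action are simplifying assumptions of this example, not a description of LPG's data structures.

```python
from typing import NamedTuple


class Action(NamedTuple):
    """Hypothetical action record; one action per level of the LA-graph."""
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset


def extract_orderings(plan, mutex):
    """Return the set of (before, after) name pairs implied by the LA-graph.

    plan: actions in level order; mutex: set of frozenset({name_a, name_b}).
    """
    orderings = set()
    for j, b in enumerate(plan):
        # (1) Mutex actions are ordered according to their levels.
        for a in plan[:j]:
            if frozenset((a.name, b.name)) in mutex:
                orderings.add((a.name, b.name))
        # (2) Causal orderings: walk back along the no-op chain of each
        # precondition until a producer (or a blocking deleter) is found.
        for p in b.pre:
            for a in reversed(plan[:j]):
                if p in a.delete:
                    break  # the no-op chain for p is blocked here
                if p in a.add:
                    orderings.add((a.name, b.name))
                    break
    return orderings


empty = frozenset()
plan = [
    Action("load_a", empty, frozenset({"in-a"}), empty),
    Action("load_b", empty, frozenset({"in-b"}), empty),
    Action("drive", frozenset({"in-a", "in-b"}), frozenset({"arrived"}), empty),
]
orderings = extract_orderings(plan, mutex=set())
```

Although `load_a` and `load_b` occupy different levels, no constraint orders them, so the extracted plan allows them to run in parallel before `drive`.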

For temporal domains where actions have durations and plan quality mainly depends 
on the makespan, rather than on the number of actions or graph levels, the distinction 
between one action or more actions per level is scarcely relevant. The order of the graph 
levels should not imply any ordering of the actions (e.g., an action at a certain level could 
terminate before the end of an action at the next level). 

Since in LA-graphs there is at most one action node for each level, and every inconsis- 
tency is an unsupported precondition, the use of this representation has some advantages 
over general A-graphs: 

• LA-graphs can be represented by simpler data structures, which allow one to manage 
the no-op propagation, the inconsistency identification and selection, and the numer- 
ical effect propagation more efficiently. 

2. LPG supports all types of preconditions and effects that can be expressed in PDDL2.1 (levels 1-3). 
• LA-graphs better support the computation of the heuristic and reachability informa-
tion used by the local search algorithm presented in the next section. As we will see, 
for these techniques it is important to derive a consistent and possibly complete de- 
scription of the state where any action in the current plan is applied. In an LA-graph, 
we can efficiently derive these state descriptions by using the levels of the graph as a 
total order of the actions in the current plan. 

• In numerical domains, if mutex actions can belong to the same level of the current 
action graph, it could be impossible to determine whether a numerical precondition of 
an action at a following level is satisfied (see footnote 3). In an LA-graph (persistent) mutex actions
belong to different levels and are ordered, making this easy to determine. 

Also note that the fact of having only one action per level allows us to define a larger 
search neighborhood. In general, a disadvantage of LA-graphs with respect to A-graphs is 
the size of the representation, since the number of levels in an LA-graph can be significantly 
larger than the number of levels in the corresponding A-graph. However, in all planning 
problems that we tested, the size of LA-graphs was never a problem for our planner (see footnote 4).

For PDDL2.1 domains involving durative actions, our planner represents temporal infor- 
mation by an assignment of real values to the action and fact nodes of the LA-graph, and 
by a set Ω of ordering constraints between action nodes. The value associated with a fact
node f (Time(f)) represents the earliest time when [f] becomes true, given the actions in
the represented plan and the constraints in Ω; the value associated with an action node a
(Time(a)) represents the earliest time when the execution of [a] can terminate. These temporal
values are derived from the duration of the actions in the LA-graph and the ordering
constraints between them that are stated in Ω.

Definition 5 A temporal action graph (TA-graph) of G is a triple (A, T, Ω) where

• A is a linear action graph;

• T is an assignment of real values to the fact and action nodes of A;

• Ω is a set of ordering constraints between action nodes of A.

The ordering constraints in a TA-graph are of two types: constraints between actions
that are implicitly ordered by the causal structure of the plan (≺_C-constraints), and constraints
that are imposed by the planner to deal with mutually exclusive actions (≺_E-constraints).
a ≺_C b belongs to Ω if and only if a is used to achieve a precondition node
of b in A, while a ≺_E b (or b ≺_E a) belongs to Ω only if a and b are mutually exclusive
in A (a ≺_E b if the level of a precedes the level of b, b ≺_E a otherwise). In Section 3.4
we will discuss how ordering constraints are introduced by LPG during the search. Given
our assumption on the types of action preconditions and effects in temporal domains, an
ordering constraint a ≺ b (where ≺ stands for ≺_C or ≺_E) states that the end of [a] is
before the start of [b]. The temporal value assigned by T to a node x, denoted by Time(x),
is derived as follows. If a fact node f of the action graph is unsupported, then Time(f)
is undefined, otherwise it is the minimum value over the temporal values assigned to the
actions supporting it. If the temporal value of every precondition node of an action node
a is undefined, and there is no action node with a temporal value that must precede a
according to Ω, then Time(a) is set to the duration of a; otherwise Time(a) is the sum
of the duration of a and the maximum value over the temporal values of its precondition
nodes and the temporal values of the action nodes that must precede a.

3. For instance, suppose we have an A-graph with two mutex actions at a level such that one action sets
the value of the numerical variable x to 10, while the other sets it to 20. Unless we order these actions,
it is impossible to determine whether x > 15 holds when the action at the next level is applied.

4. LPG's implementation of LA-graphs uses an extended version of Hoffmann's "connectivity graph", a
compact representation of the action and fact nodes in a planning graph (Hoffmann & Nebel, 2001).
The extensions are needed to represent persistent mutex relations, durative actions and numerical
preconditions/effects.

[Figure 1 graph not reproduced. Ω = {a_1 ≺_C a_4; a_2 ≺_C a_3; a_1 ≺_E a_2; a_2 ≺_E a_3; a_2 ≺_E a_4}]

Figure 1: An example of TA-graph. Dashed edges form chains of no-ops that are blocked by
mutex actions. Round brackets contain temporal values assigned by T to the fact
nodes (circles) and the action nodes (squares). The numbers in square brackets
represent the durations of the actions. "(-)" indicates that the corresponding fact
node is not supported.

Figure 1 gives an example of a TA-graph containing four action nodes (a_1, ..., a_4) and several
fact nodes representing thirteen facts. Since a_1 supports a precondition node of a_4, a_1 ≺_C a_4
belongs to Ω (similarly for a_2 ≺_C a_3). a_1 ≺_E a_2 also belongs to Ω because a_1 and a_2 are
persistently mutex (similarly for a_2 ≺_E a_3 and a_2 ≺_E a_4). The temporal value assigned to
the facts f_1, ..., f_5 at the first level is zero, because they belong to the initial state. a_1 has all
its preconditions supported at time zero, and hence Time(a_1) is the duration of a_1. Since
a_1 ≺ a_2 ∈ Ω, Time(a_2) is given by the sum of the duration of a_2 and the maximum value
over the temporal values of its precondition nodes (zero) and Time(a_1). Since f_9 at level 3
is supported only by a_2, and this is the only supported precondition node of a_3, Time(a_3)
is the sum of Time(a_2) = Time(f_9) and the duration of a_3. Since a_2 must precede a_4
(but there is no ordering constraint between a_3 and a_4), Time(a_4) is the maximum value
over Time(a_2) and the temporal values of its supported precondition nodes (f_6), plus the
duration of a_4. Finally, note that f_12 at the last level is supported both by a_4 and a_3. Since
Time(a_3) > Time(a_4), we have that Time(f_12) at this level is equal to Time(a_4).
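
The derivation of the temporal values can be sketched in Python as follows. The dictionary encoding, the function name `compute_times`, the durations, and the fact names are hypothetical (the durations of Figure 1 are not reproduced here); only the two rules are taken from the text: a supported fact takes the minimum time over its supporting actions, and an action takes its duration plus the maximum over its supported preconditions and the actions that must precede it according to Ω.

```python
def compute_times(actions, init_facts, omega):
    """Propagate Time(.) through a linear action graph.

    actions: list of dicts (level order) with keys name, dur, pre, add, delete.
    omega: set of (before, after) action-name pairs.
    Returns (Time of each action, Time of each supported fact at the last level).
    """
    fact_time = {f: 0.0 for f in init_facts}  # unsupported facts are absent
    act_time = {}
    for a in actions:
        # Supported preconditions and ordered predecessors bound the start.
        bounds = [fact_time[p] for p in a["pre"] if p in fact_time]
        bounds += [act_time[b] for (b, c) in omega
                   if c == a["name"] and b in act_time]
        end = a["dur"] + max(bounds, default=0.0)
        act_time[a["name"]] = end
        for f in a["delete"]:
            fact_time.pop(f, None)  # the no-op chain of f is blocked
        for f in a["add"]:
            fact_time[f] = min(end, fact_time.get(f, end))
    return act_time, fact_time


# Toy instance with the ordering structure of Figure 1 and made-up durations.
actions = [
    {"name": "a1", "dur": 50, "pre": {"f1"}, "add": {"f6"}, "delete": set()},
    {"name": "a2", "dur": 70, "pre": {"f2"}, "add": {"f9"}, "delete": set()},
    {"name": "a3", "dur": 30, "pre": {"f9", "f10"}, "add": {"f12"}, "delete": set()},
    {"name": "a4", "dur": 20, "pre": {"f6"}, "add": {"f12"}, "delete": set()},
]
omega = {("a1", "a4"), ("a2", "a3"), ("a1", "a2"), ("a2", "a4")}
act_time, fact_time = compute_times(actions, {"f1", "f2", "f3", "f4", "f5"}, omega)
```

With these assumed durations, Time(a1) = 50, Time(a2) = 120, Time(a3) = 150 and Time(a4) = 140, so the fact supported by both a3 and a4 gets the smaller value 140, mirroring the argument made for Figure 1.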

Definition 6 A temporal solution graph for G is a TA-graph (A, T, Ω) such that A is
a solution LA-graph of G, T is consistent with Ω and the duration of the actions in A, Ω is
consistent, and for each pair (a, b) of mutex actions in A, either Ω ⊨ a ≺ b or Ω ⊨ b ≺ a.

While obviously the levels in a TA-graph do not correspond to real time values, they
represent a topological order for the ≺_C-constraints in the TA-graph (i.e., the actions of the
TA-graph that are ordered according to their relative levels form a linear plan satisfying
all ≺_C-constraints). This topological sort can be a valid total order for the ≺_E-constraints
of the TA-graph as well, provided that these constraints are appropriately stated during
search, i.e., that if a and b are exclusive, the planner appropriately imposes either a ≺_E b
or b ≺_E a. LPG chooses a ≺_E b if the level of a precedes the level of b, and b ≺_E a otherwise.
Under this assumption on the "direction" in which ≺_E-constraints are imposed, it is easy
to see that the levels of a TA-graph correspond to a topological order of the actions in the
represented plan satisfying every ordering constraint in Ω.

For planning domains that require minimizing the plan makespan (like the "Time",
"SimpleTime", "Complex", and some of the "Numeric" and "HardNumeric" domain sets
of the 3rd IPC) each element of LPG's search space is a TA-graph. For domains where time
is irrelevant (like the "Strips" and "Numeric" domain sets of the 3rd IPC) the search space
is formed by LA-graphs (see footnote 5).

2.3 Action Durations and Costs 

In this section we comment on the representation of action durations and action costs in 
LPG. In accordance with PDDL2.1, our planner handles both static durations and dynamic
durations, i.e., durations depending on the state in which the action is applied. Static
durations are either explicitly given as numbers specified in the field ":duration" of the
operator description, or they are implicitly specified by an expression involving some static 
quantities specified in the initial state of the planning problem. An example of implicit 
static duration is the duration of the Drive actions in the Depots-Time domain of the 3rd 
IPC: the Drive operator defines the duration as the distance between the source and the 
destination of travel (two operator parameters instantiated by values specified in the initial 
state), divided by the speed of the vehicle that is driven (another operator parameter). 

Typically, dynamic durations depend on some numeric quantities that may vary from 
one state to another state reached by the actions in the plan. An example is the energy 
of a rover in the domain Rovers-Time of the 3rd IPC, where the duration specified in the 
recharge operator is 

(/ (- 80 (energy ?x)) (recharge-rate ?x)).

5. An experimental analysis showed that in STRIPS domains the techniques for LA-graphs are more powerful
than the techniques for A-graphs that we proposed in previous work (Gerevini & Serina, 2002).

This expression depends on the current value of energy for the rover ?x and on its static
recharging rate (recharge-rate) specified in the initial state. Our planner handles the 
dynamic duration of an action by computing and maintaining during search an estimate of 
the value of the numerical quantities in the state where the action is applied. In Section 
3.5 we will briefly describe how the version of LPG that took part in the 3rd IPC handles
numerical state variables, and numerical preconditions and effects involving them. However, 
in this paper we will not describe their treatment in detail, and in the next section we will 
assume that action durations are static. 
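
The difference between the two kinds of duration can be made concrete with a small Python sketch. The function names and all numeric values below are hypothetical; only the two expressions (distance divided by speed, and the recharge formula with the constant 80) come from the text.

```python
def drive_duration(distance, speed):
    """Implicit static duration: both quantities are fixed in the initial state."""
    return distance / speed


def recharge_duration(energy, recharge_rate, capacity=80.0):
    """Dynamic duration: depends on the energy in the state where it is applied."""
    return (capacity - energy) / recharge_rate


# A static duration can be computed once, before search (made-up values).
static = drive_duration(distance=120.0, speed=40.0)

# A dynamic duration changes with the estimated state reached by the plan.
early = recharge_duration(energy=50.0, recharge_rate=20.0)
late = recharge_duration(energy=20.0, recharge_rate=20.0)
```

The same recharge action takes 1.5 time units when applied with energy 50 and 3.0 when applied with energy 20, which is why an estimate of the numerical state is needed at the point where the action is applied.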

Each action of a plan can be associated with a cost that may affect the plan quality. 
Like action durations, in general these costs could be either static or dynamic, though the 
current version of LPG handles only static ones. LPG precomputes the action costs using the 
plan metric specified in the problem description using the PDDL2.1 field ":metric" (see footnote 6). For
instance, the plan metric used for a problem in the ZenoTravel-Numeric domain of the 3rd 
IPC is 

(:metric minimize (+ (* 4 (total-time)) (* 5 (total-fuel-used)))), 

i.e., it is the sum of four times the plan makespan and five times the total amount of the fuel 
used by the actions in the plan. The cost of an action a is derived by evaluating how the 
value of the plan metric expression is changed by the effects of a. LPG computes an initial 
value m_0 for the expression by using the values specified in the initial state as values of the
involved numerical variables. From m_0 LPG derives a new value m_1 by applying the effects
of a that increase/decrease the value of one or more variables in the expression. The cost of
a is defined as m_1 - m_0. If this difference is zero, in order to prefer plans containing a lower
number of actions, the cost of a is set to a small positive quantity. Notice also that in these 
evaluations of the metric expression the temporal value total-time is not considered (it is 
set to zero, if present), because the temporal aspect of the plan quality is already taken into 
account by the durations of the actions. In the previous example, the metric subexpression 
used to derive the action costs is (* 5 (total-fuel-used)). Thus, for instance, the cost
of the ZenoTravel action (fly plane1 city0 city1) in the problem pfile1 of the 3rd
IPC is 13560 because the effects of this action increase total-fuel-used by the following
quantity

(* (distance city0 city1) (slow-burn plane1)) = 678 * 4 = 2712,

which increases the metric value of the plan by 5 * 2712 = 13560.
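
The cost computation just described can be reproduced with a short Python sketch. The function `action_cost`, the dictionary encoding of a linear metric, and the value chosen for the "small positive quantity" (`EPSILON`) are assumptions made for illustration; the numbers are those of the ZenoTravel example above.

```python
EPSILON = 1e-3  # hypothetical value for the "small positive quantity"


def action_cost(metric_weights, initial_values, effects):
    """Cost of an action as the change m_1 - m_0 of a linear plan metric.

    metric_weights: {variable: coefficient} of the metric expression.
    initial_values: {variable: value} in the initial state.
    effects: {variable: increment} produced by the action.
    """
    # total-time is ignored: durations already account for the temporal part.
    weights = {v: w for v, w in metric_weights.items() if v != "total-time"}
    m0 = sum(w * initial_values.get(v, 0) for v, w in weights.items())
    after = dict(initial_values)
    for v, delta in effects.items():
        after[v] = after.get(v, 0) + delta
    m1 = sum(w * after.get(v, 0) for v, w in weights.items())
    cost = m1 - m0
    return cost if cost != 0 else EPSILON


metric = {"total-time": 4, "total-fuel-used": 5}
fuel = 678 * 4  # (* (distance city0 city1) (slow-burn plane1))
fly_cost = action_cost(metric, {"total-fuel-used": 0}, {"total-fuel-used": fuel})
idle_cost = action_cost(metric, {"total-fuel-used": 0}, {})
```

Here `fly_cost` evaluates to 13560, matching the value given in the text, while an action with no effect on the metric receives the small default cost.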

3. Local Search in the Space of Temporal Action Graphs 

In this section we present some search techniques used in LPG. We start with a description 
of the general local search scheme in the space of action graphs. Then we concentrate on 
temporal action graphs giving a detailed description of LPG's heuristics and of its methods
for computing and using reachability information, for maintaining the TA-graph represen- 
tation during search, and for deriving good quality plans incrementally. In order to simplify 
the notation, instead of using a and [a] to indicate an action node and the action represented
by this node respectively, we will use a to indicate both of them (the appropriate
interpretation will be clear from the context).

6. As in the domains of the competitions, we assume that the plan metric expression is linear. For simple
STRIPS domains, where there is no metric expression to minimize, the cost of each action is set to one,
and LPG minimizes the number of actions in the plan.

3.1 Basic Search Procedure: Walkplan 

Given a planning graph G, the local search process of LPG starts from an initial A-graph of
G (i.e., a partial plan), and transforms it into a solution graph (i.e., a valid plan) through
the iterative application of graph modifications improving the current partial plan. The two
basic modifications consist of an extension of the A-graph to include a new action node, or
a reduction of the A-graph to remove an action node (and the relevant edges); see footnote 7. At any step
of the search process, which produces a new A-graph, the set of actions that can be added 
or removed is determined by the inconsistencies that are present in the current A-graph. 

The general scheme for searching for a solution graph (a final state of the search) consists 
of two main steps. The first step is an initialization of the search in which we construct 
an initial A-graph. The second step is a local search process in the space of all A-graphs, 
starting from the initial A-graph. We can generate an initial A-graph in several ways. Four 
possibilities that can be performed in polynomial time, and that we have implemented are: 
an empty A-graph (i.e., containing only the no-ops of the facts in the initial state, and the 
special action nodes a_start and a_end); a randomly generated A-graph; an A-graph where all
precondition facts are supported, but in which there may be some violated mutex relations; 
and an A-graph obtained from an existing plan given as input to the process. The last 
option is particularly useful in the plan optimization phase, as well as for solving plan 
adaptation problems (Gerevini & Serina, 2000). In the current version of LPG, the default
initialization strategy is the empty action graph with the fixed-point level as the last level 
of the graph. Further details on the initialization step can be found in earlier papers on 
planning through local search and action graphs (Gerevini Sz Serina, 1999, 2000). 

Once we have computed an initial A-graph, each basic search step selects an inconsis- 
tency in the current A-graph. If this is an unsupported fact node, then in order to resolve 
(eliminate) it, we can either add an action node that supports it, or we can remove an action 
node that is connected to that fact node by a precondition edge. If the chosen inconsistency 
is a mutex relation, then we can remove one of the action nodes of the mutex relation. Note 
that the elimination of an action node can remove several inconsistencies (e.g., all those cor- 
responding to the unsupported preconditions of the action removed). On the other hand, 
obviously the addition of an action node can introduce several new inconsistencies. The 
strategy for selecting the next inconsistency to handle may have a significant impact on the 
overall performance (this has been extensively studied in the context of causal-link 
partial-order planning, e.g., Pollack, Joslin, & Paolucci, 1997; Gerevini & Schubert, 1996). Our 
planner includes several strategies that we are currently testing. The default strategy, which 
we used in the 3rd IPC and in all experiments presented in Section 4, prefers inconsistencies 
appearing at the earliest level of the graph. 

Given an action graph A and an inconsistency σ in A, the neighborhood N(σ, A) of σ 
in A is the set of A-graphs obtained from A by applying a graph modification that resolves 

7. Another possible modification that is analyzed by Gerevini and Serina (2002), but that will not be 
considered in this paper, is action ordering, i.e., moving forward or backward one of two exclusive action 
nodes. 

Walkplan(Π, maxsteps, max-restarts, p) 

Input: A planning problem Π, the maximum number of search steps maxsteps, 
the maximum number of search restarts max-restarts, a noise factor p (0 < p < 1). 
Output: A solution graph representing a plan solving Π or fail. 

1.  for i ← 1 to max-restarts do 
2.      A ← an initial A-graph derived from the planning graph of Π; 
3.      for j ← 1 to maxsteps do 
4.          if A is a solution graph then 
5.              return A 
6.          σ ← an inconsistency in A; 
7.          N(σ, A) ← neighborhood of A for σ; 
8.          if ∃ A' ∈ N(σ, A) such that the quality of A' is not worse than the quality of A 
9.          then A ← A' (if there is more than one A'-graph, choose randomly one) 
10.         else if random < p then 
11.             A ← an element of N(σ, A) randomly chosen 
12.         else A ← best element in N(σ, A); 
13. return fail. 

Figure 2: General scheme of Walkplan with restarts. random is a randomly chosen value 
between 0 and 1. The quality of an action graph in the neighborhood is measured 
using an evaluation function estimating the cost of the graph modification used 
to generate it from the current action graph. 



σ. At each step of the local search scheme, the elements of the neighborhood are evaluated 
according to a function estimating their quality, and an element with the best quality is 
then chosen as the next possible A-graph (search state). The quality of an A-graph depends 
on a number of factors, such as the number of inconsistencies and the estimated number of 
search steps required to resolve them, the overall cost of the actions in the represented plan 
and its makespan. 8 

Gerevini and Serina (1999) proposed three general strategies for guiding the local search: 
Walkplan, Tabuplan and T-Walkplan. In this paper we focus on Walkplan, which is the 
strategy used by LPG in the 3rd IPC, as well as in the experimental tests presented in Section 
4. Walkplan is similar to Walksat, a stochastic local search method for solving propositional 
satisfiability problems (Selman et al., 1994; Kautz & Selman, 1996). In Walkplan the best 
element in the neighborhood is the A-graph which has the lowest decrease of quality with 
respect to the current A-graph, i.e., it does not consider possible improvements. Like 
Walksat, our strategy uses a noise parameter p. Given an A-graph A and an inconsistency σ, 
if there is a modification for σ that does not decrease the quality of A, then this modification 
is performed, and the resulting A-graph is chosen as the next A-graph; otherwise, with 

8. For simple strips domains the execution cost of the plan is measured in terms of the number of actions 
(i.e., each action has cost 1), while plan makespan is ignored. Alternatively it can be modeled as the 
number of parallel time steps (Gerevini & Serina, 2002). 

probability p one of the graphs in N(σ, A) is chosen randomly, and with probability 1 − p 
the next A-graph is chosen according to the minimum value of the evaluation function. 
If a solution graph is not reached after a certain number of search steps (maxsteps), the 
current A-graph and maxsteps are reinitialized, and the search is repeated up to a 
user-defined maximum number of times (max-restarts). Figure 2 gives a formal description of 
Walkplan with restarts. 
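
The control flow of Figure 2 can be sketched in a few lines of Python. This is only an illustrative sketch, not LPG's implementation: the callables `initial_graph`, `is_solution`, `pick_inconsistency`, `neighborhood` and `cost` are hypothetical stand-ins for the components described in the text, a cost of at most zero is read as "quality not worse", and restarting on an empty neighborhood is an assumption made here.

```python
import random

def walkplan(initial_graph, is_solution, pick_inconsistency, neighborhood,
             cost, maxsteps, max_restarts, p, rng=None):
    """Walkplan with restarts, following the scheme of Figure 2.

    cost(current, candidate) is a hypothetical evaluation function: it
    estimates the cost of the modification turning `current` into
    `candidate`, and a value <= 0 is read as "quality not worse".
    """
    rng = rng or random.Random()
    for _ in range(max_restarts):
        graph = initial_graph()
        for _ in range(maxsteps):
            if is_solution(graph):
                return graph
            sigma = pick_inconsistency(graph)
            candidates = neighborhood(sigma, graph)
            if not candidates:
                break  # assumption: an empty neighborhood triggers a restart
            not_worse = [c for c in candidates if cost(graph, c) <= 0]
            if not_worse:
                graph = rng.choice(not_worse)
            elif rng.random() < p:
                graph = rng.choice(candidates)  # noise step
            else:
                graph = min(candidates, key=lambda c: cost(graph, c))
    return None  # fail


# Toy usage: a "graph" is an integer, 0 is the only solution, and each
# modification moves one unit left or right.
def toy_run(maxsteps, max_restarts):
    return walkplan(
        initial_graph=lambda: 5,
        is_solution=lambda g: g == 0,
        pick_inconsistency=lambda g: g,
        neighborhood=lambda sigma, g: [g - 1, g + 1],
        cost=lambda cur, cand: abs(cand) - abs(cur),
        maxsteps=maxsteps, max_restarts=max_restarts, p=0.1,
        rng=random.Random(0),
    )
```

In the toy run a non-worsening neighbor always exists, so the noise branch is never taken; with too few steps per restart the procedure returns `None`, mirroring the "fail" outcome.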

Gerevini and Serina (2002) proposed some heuristic functions for evaluating the search 
neighborhood of A-graphs with action costs. In the next section we present additional, 
more powerful heuristic functions for LA-graphs and TA-graphs. These techniques are 
implemented in the latest version of our planner and were used in the 3rd IPC. 

3.2 Neighborhood and Heuristics for Temporal Action Graphs 

The search neighborhood for an inconsistency σ in an LA-graph A is the set of LA-graphs 
that can be derived from A by adding an action node supporting σ, or removing the action 
with precondition σ (in linear graphs the only type of inconsistencies are unsupported 
preconditions). An action a supporting σ can be added to A at any level l preceding the 
level of σ, and such that the desired effect of a is not blocked before or at the level of σ 
(assuming that the underlying planning graph contains a at level l). The neighborhood for 
σ contains a linear action graph for each of these possibilities. 

Since at any level of an LA-graph there can be at most one action node (plus any number 
of no-ops), when we remove an action node from A, the corresponding action level becomes 
"empty" (i.e., it contains only no-ops). 9 If the LA-graph contains adjacent empty levels, 
and in order to resolve the selected inconsistency a certain action node can be added at 
any of these levels, then the corresponding neighborhood contains only one of the resulting 
graphs. 

When we add an action node to a level l that is not empty, the LA-graph is extended 
by one level, all action nodes from l are shifted forward by one level, and the new action is 
inserted at level l (Figure 8 in Section 3.4 gives an example). Moreover, when we remove an 
action node a from the current LA-graph, we can also remove each action node b supporting 
only the preconditions of a. Similarly, we can remove the actions supporting only the 
preconditions of b, and so on. While this induced pruning is not necessary, an experimental 
analysis showed that it tends to produce better quality plans more quickly. 
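
The induced pruning described above amounts to a small fixed-point computation. The sketch below assumes a hypothetical `supports` map from each action to the set of actions whose preconditions it supports; this data layout is an illustration, not LPG's internal representation. Actions that support nothing are conservatively left in place.

```python
def induced_pruning(removed, supports):
    """Return `removed` plus every action left supporting only pruned actions.

    supports: action -> set of actions whose preconditions it supports
              (hypothetical representation chosen for this sketch).
    """
    pruned = {removed}
    changed = True
    while changed:
        changed = False
        for action, consumers in supports.items():
            # Prune an action once all of its consumers have been pruned.
            if action not in pruned and consumers and consumers <= pruned:
                pruned.add(action)
                changed = True
    return pruned


SUPPORTS = {
    "b": {"a"},        # b supports only a
    "c": {"b"},        # c supports only b
    "d": {"a", "e"},   # d also supports e, so it must stay
    "a": {"end"},
    "e": {"end"},
}
PRUNED = induced_pruning("a", SUPPORTS)
```

Here removing `a` also removes `b` and then `c`, while `d` survives because it still supports `e`.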

9. Note that the empty levels are ignored during the extraction of the plan from the (temporal) solution 
graph. 

The elements of the neighborhood are evaluated according to an action evaluation 
function E estimating the cost of adding (E(a)^i) or removing an action node a (E(a)^r). In 
general, E consists of three weighted terms evaluating three aspects of the quality of the 
current plan that are affected by the addition/removal of a: 

$$
E(a) = \begin{cases}
E(a)^i = \alpha \cdot \text{Execution-cost}(a)^i + \beta \cdot \text{Temporal-cost}(a)^i + \gamma \cdot \text{Search-cost}(a)^i \\
E(a)^r = \alpha \cdot \text{Execution-cost}(a)^r + \beta \cdot \text{Temporal-cost}(a)^r + \gamma \cdot \text{Search-cost}(a)^r
\end{cases}
$$

The first term of E estimates the increase of the plan execution cost (Execution-cost), 
the second estimates the end time of a (Temporal-cost), and the third estimates the increase 
of the number of search steps needed to reach a solution graph (Search-cost). The 
coefficients of these terms are used to normalize them, and to weigh their relative importance 
(more on this in Section 3.6). 
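
As a small illustration of how the three weighted terms interact, the sketch below ranks two candidate modifications under different coefficient settings. The term values and candidate names are hypothetical, and the coefficients are passed in directly rather than derived as in Section 3.6.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CostTerms:
    execution: float  # Execution-cost term
    temporal: float   # Temporal-cost term
    search: float     # Search-cost term

def evaluate(terms, alpha, beta, gamma):
    """Weighted sum E = alpha*Execution + beta*Temporal + gamma*Search."""
    return alpha * terms.execution + beta * terms.temporal + gamma * terms.search

def best_modification(candidates, alpha, beta, gamma):
    """Return the name of the candidate modification with the lowest E."""
    return min(candidates, key=lambda n: evaluate(candidates[n], alpha, beta, gamma))

# Hypothetical neighborhood with two candidate modifications.
CANDIDATES = {
    "add a1": CostTerms(execution=2, temporal=240, search=3),
    "remove a7": CostTerms(execution=1, temporal=300, search=1),
}
```

With `alpha = gamma = 1` and `beta = 0` the removal is preferred (2 against 5), whereas weighting only the temporal term prefers the addition (240 against 300).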

In the computation of the terms of E there is an important tradeoff to consider. On one 
hand, an accurate evaluation of them could lead to valid plans of good quality within a few 
search steps. On the other hand, the computation of E should be fast enough, because 
the neighborhood could contain many elements, and an accurate evaluation of its elements 
could slow down the search excessively. In the design of our heuristics for evaluating the 
terms of E we took this tradeoff into account, trying to find an appropriate balance between 
informativeness and efficiency of computation. 

The evaluation of the terms of E is based on computing particular relaxed plans for 
achieving certain action preconditions in the context of the current TA-graph. In the next 
subsections, first we describe how these relaxed plans are derived, and then we give a 
detailed description of how the terms of E are defined using relaxed plans. 

3.2.1 Relaxed Plans for Action Preconditions 

Suppose we are evaluating the addition of a at a level l of the current linear action graph A. 
The three terms of E are heuristically estimated by computing a relaxed plan π_r containing 
a minimal set of actions for achieving (1) the unsupported preconditions of a and (2) the 
set Σ of preconditions of the other actions in the LA-graph that would become unsupported 
by adding a (because it would block the no-op propagation currently used to support such 
preconditions). This plan is relaxed because during its construction we do not consider the 
possible interference between actions resulting from delete-effects. 

π_r is computed in two stages. First we deal with the preconditions of type (1) and then 
with the preconditions of type (2). The generation of π_r depends on the actions in the 
current partial plan (the plan represented by A) in two ways: 

• The actions in the current plan are used to define an initial state for the problems of 
achieving the preconditions of a and those in Σ. In particular, the relaxed subplan 
for the preconditions of a is computed from the state INIT_l obtained by applying the 
actions in A up to level l − 1, ordered according to their corresponding levels. 10 The 
relaxed subplan for achieving Σ is computed from INIT_l modified by the effects of 
a, and it can reuse the actions in the relaxed subplan previously computed for the 
preconditions of a. 

• In the process of deriving a relaxed plan, when we choose an action, we consider its 
potential interference with the no-ops that support a precondition of some action in 
A at a level following l, and we prefer actions that do not block the propagation of 
such no-ops. The motivation is that taking these interferences into account during 
the construction of a relaxed plan can lead to a better estimate of the cost required 
to support preconditions of type (1) and (2) in the context of the search that we are 
conducting to transform the current action graph into a solution graph (more details 
below). 

10. Notice that, as we pointed out in the previous section, the levels in a TA-graph correspond to a total 
order of the actions of the represented partial-order plan that is consistent with the ordering constraints 
in Ω (though, of course, this is not necessarily the only valid total order). 

We indicate with Threats(a) the set of preconditions of the actions in A that would 
become unsupported when adding a (similarly for the action preconditions that could be 
subverted by an action in the relaxed plan). Using the causal-link notation of partial-order 
planners (e.g., McAllester & Rosenblitt, 1991; Penberthy & Weld, 1992), Threats(a) can be 
formally defined in the following way: 

$$
\text{Threats}(a) = \{\, f \mid \text{no-op}(f) \text{ and } a \text{ are mutex};\ \exists\, b, c \in A \text{ such that } b \xrightarrow{f} c \,\}.
$$

Note that, according to our representation, the causal link b → c implies 
Level(b) < Level(a) < Level(c), where Level(x) denotes the level of x in A. 

Figure 3 gives a recursive algorithm for computing our relaxed plans, RelaxedPlan, 
which uses the following additional notation. Duration(a) denotes the duration of a; 11 
Pre(a) denotes the precondition nodes of a; Add(a) denotes the (positive) effect nodes of 
a; Supported-facts(l) denotes the set of positive facts that are true after executing the 
actions at levels that precede l (ordered according to their level); Num-acts(p, l) denotes 
an estimated minimum number of actions required to reach p from Supported-facts(l) (if 
p is not reachable, Num-acts(p, l) is a negative number). The technique for computing 
Num-acts is described in Section 3.3. 

Given a set G of goal facts, an initial state INIT_l, and a possibly empty set of actions A, 
RelaxedPlan computes a pair Rplan = (ACTS, t) where: ACTS is a set of actions including 
A and forming a relaxed plan achieving G from INIT_l; t is a temporal value estimating the 
earliest time when all facts in G are achieved. The first element of Rplan is indicated with 
Aset(Rplan), the second with End-time(Rplan). 

As mentioned above and described in detail in Section 3.2.3, when we evaluate the 
addition of an action a, RelaxedPlan is run twice: first to compute a relaxed plan for the 
preconditions of a, and then to extend this plan for achieving the preconditions that would 
be subverted by a (i.e., Threats(a)). The input set A is the set of actions currently in the 
relaxed plan that can be "reused" to achieve an action precondition or goal of the relaxed 
(sub)problem. A is not empty whenever RelaxedPlan is recursively executed, and when it is 
run to achieve Threats(a). 

RelaxedPlan constructs Rplan through a backward process where Bestaction(g) is the 
action a' chosen to achieve a (sub)goal g, and such that: (i) g is an effect of a'; (ii) all 
preconditions of a' are reachable from INIT_l; (iii) the reachability of the preconditions 
of a' requires a minimum number of actions, estimated as the maximum of the heuristic 
minimum number of actions required to support each precondition p of a' from INIT_l (i.e., 
the maximum of Num-acts(p, l) over each precondition p of a'); (iv) a' subverts a minimum 
number of supported precondition nodes in A (i.e., the size of the set Threats(a') is minimal). 
More formally, 

$$
\text{Bestaction}(g) = \underset{a' \in A_g}{\mathrm{ARGMIN}} \Big\{ \underset{p \in Pre(a') - F}{\mathrm{MAX}}\ \text{Num-acts}(p, l) + |\text{Threats}(a')| \Big\},
$$

where F is the set of positive effects of the actions currently in ACTS, and A_g is the set of 
actions with the effect g and with reachable preconditions, 12 i.e., 

$$
A_g = \{\, a \in O \mid g \in Add(a),\ O \text{ is the set of all actions},\ \forall p \in Pre(a)\ \ \text{Num-acts}(p, l) \ge 0 \,\}.
$$

11. If the duration of a is dynamic (it depends on the value of one or more numerical variables), it is 
computed using the values of the numerical variables in INIT_l. 

RelaxedPlan(G, INIT_l, A) 

Input: A set of goal facts (G), the set of facts that are true after executing the actions of 
the current TA-graph up to the level l (INIT_l), a possibly empty set of actions (A); 
Output: A set of actions and a real number, estimating a minimal set of actions required 
to achieve G and the earliest time when all facts in G can be achieved, respectively. 

1.  t ← MAX over g ∈ G ∩ INIT_l of Time(g); 
2.  G ← G − INIT_l; ACTS ← A; 
3.  F ← ∪ over a ∈ ACTS of Add(a); 
4.  t ← MAX{t, MAX over g ∈ G ∩ F of T(g)}; 
5.  while G − F ≠ ∅ 
6.      g ← a fact in G − F; 
7.      bestact ← Bestaction(g); 
8.      Rplan ← RelaxedPlan(Pre(bestact), INIT_l, ACTS); 
9.      forall f ∈ Add(bestact) − F 
10.         T(f) ← End-time(Rplan) + Duration(bestact); 
11.     ACTS ← Aset(Rplan) ∪ {bestact}; 
12.     F ← ∪ over a ∈ ACTS of Add(a); 
13.     t ← MAX{t, End-time(Rplan) + Duration(bestact)}; 
14. return (ACTS, t). 

Figure 3: Algorithm for computing a relaxed plan achieving a set of action preconditions 
from the state INIT_l. Rplan is a pair of values (Aset(Rplan), End-time(Rplan)), 
where the first value is a set of actions and the second is a temporal quantity. 
Bestaction(g) is the action that is heuristically chosen to support g as described 
in the text. 

Notice that the set of actions O in the definition of A_g does not contain operator instances 
with mutually exclusive preconditions. The reason why Bestaction(g) considers the cost 
of the preconditions in Pre(a') − F, instead of in Pre(a'), is that the preconditions of a' 
that are in F are already supported by other actions currently in the relaxed plan under 
construction. 

12. In principle A_g can be empty because g might not be reachable from INIT_l (i.e., Bestaction(g) = ∅). 
RelaxedPlan treats this special case by forcing its termination and returning a set of actions including 
a special action with a very high cost, leading E to consider the element of the neighborhood under 
evaluation a bad possible next search state. For clarity we omit these details from the formal description 
of the algorithm. 

Requirements (i) and (ii) for the definition of Bestaction are obvious. Regarding (iii) 
and (iv), we considered alternative versions that are implemented in LPG, but that are not 
used as default strategies because we experimentally found that on average they lead to a 
worse performance. In particular, instead of using the maximum of the heuristic minimum 
number of actions required to support each precondition of a', we tested the use of the sum 
of such numbers, which can give an overestimation of the actual search cost. We have 
also tested a version of Bestaction which does not consider (iv), i.e., without the term 
|Threats(a')|. While this simplified version is faster to compute, overall the performance 
of the planner was on average worse both in terms of CPU-time and quality of the plans 
produced (detailed results of this experiment are available from the web page of LPG). 
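
The selection rule and the two variants discussed above (maximum versus sum, with or without the threat term) can be compared on a toy instance. The sketch below is an illustration under stated assumptions: the dictionaries `actions`, `num_acts` and `threats`, and all fact and action names, are hypothetical, and ties are broken by dictionary order rather than by any rule from the paper.

```python
def best_action(goal, actions, num_acts, threats, achieved=frozenset(), combine=max):
    """Pick the action supporting `goal` with the lowest heuristic score.

    actions:  name -> (set of preconditions, set of add effects)
    num_acts: fact -> estimated number of actions (negative if unreachable)
    threats:  name -> number of supported preconditions the action subverts
    achieved: facts already produced by the relaxed plan (the set F)
    combine:  max for the default rule, sum for the alternative that was tested
    """
    def score(name):
        open_pre = actions[name][0] - achieved
        costs = [num_acts[p] for p in open_pre]
        return (combine(costs) if costs else 0) + threats[name]

    candidates = [
        name for name, (pre, add) in actions.items()
        if goal in add and all(num_acts.get(p, -1) >= 0 for p in pre)
    ]
    return min(candidates, key=score) if candidates else None


# Hypothetical instance: x and y both achieve g; z needs an unreachable fact.
ACTIONS = {
    "x": ({"p", "q"}, {"g"}),
    "y": ({"r"}, {"g"}),
    "z": ({"u"}, {"g"}),
}
NUM_ACTS = {"p": 2, "q": 2, "r": 3, "u": -1}
NO_THREATS = {"x": 0, "y": 0, "z": 0}
```

With the maximum, `x` scores 2 and `y` scores 3, so `x` is chosen; with the sum, `x` scores 4 and `y` wins instead, which shows how the sum can overestimate when preconditions share supporting actions. Adding two threats to `x` also flips the default choice to `y`.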

Steps 1, 4 and 13 of RelaxedPlan estimate the earliest time required to achieve all goals 
in G. This is recursively defined as the maximum of 

(a) the times assigned to the facts in G that are already true in the state INIT_l (step 1); 

(b) the estimated earliest time T(g) required to achieve every fact g in G that is an effect 
of an action currently in ACTS (step 4); 

(c) the estimated earliest time required to complete the execution of the actions chosen 
by Bestaction to achieve each of the remaining facts in G (step 13). 

The T-times of (b) are computed by steps 9-10 from the relaxed subplan derived to 
achieve them. Clearly the algorithm terminates, because either every (sub)goal p is 
reachable from INIT_l (i.e., Num-acts(p, l) ≥ 0), or at some point bestact = ∅ holds, forcing 
immediate termination (see footnote 12). Moreover, it can be proved that the complexity 
of the algorithm is polynomial in the number of actions and facts in the planning 
problem/domain. 

3.2.2 An Example Illustrating RelaxedPlan 

Suppose we are evaluating the addition of a to the current TA-graph A illustrated in Figure 
4. For each fact that is used in the example, the tables of Figure 4 give the relative 
Num-acts value or the temporal value (Num-acts for the unsupported facts, Time for the 
other nodes). The Num-acts value for a fact belonging to INIT_l is zero. The durations of 
the actions used in the example are indicated in the corresponding table of Figure 4. Solid 
circle and square nodes represent precondition and action nodes in A ∪ {a}; dotted circle 
and square nodes represent the precondition and action nodes that are considered during 
the evaluation process; finally, the gray circle and square nodes represent the precondition 
and action nodes that are selected by RelaxedPlan. 

First we describe the derivation of the sets of actions in the relaxed plan for Pre(a) and 
Threats(a), i.e., 

S1 = Aset(RelaxedPlan(Pre(a), INIT_l, ∅)) and 

S2 = Aset(RelaxedPlan(Threats(a), INIT_l, S1)) 

respectively. Then we describe the derivation of the estimation of the earliest time when all 
preconditions in Pre(a) can be achieved, i.e., End-time(RelaxedPlan(Pre(a), INIT_l, ∅)). 

Actions for Pre(a) in the Relaxed Plan 

Pre(a) is {p1, p2, p3} but, since p2 ∈ INIT_l, in the first execution of RelaxedPlan step 2 
removes p2 from G. So, only p1 and p3 are the goals of the relaxed problem. Suppose 
that in order to achieve p1 we can use a1, a2 or a3 (forming the set A_g examined by 
Bestaction(p1) in step 7). Each of these actions is evaluated, and a1 is chosen. In the 
recursive call of RelaxedPlan applied to the preconditions of a1, p5 is not considered because 
it already belongs to INIT_l. Regarding the other precondition of a1 (p4), suppose that 
a4 is the only action achieving it. Then this action is chosen to achieve p4, and since its 
preconditions belong to INIT_l, they are not evaluated (the new recursive call of RelaxedPlan 
returns an empty action set). 

Figure 4: An example illustrating RelaxedPlan. Square nodes represent action nodes, while 
the other nodes represent fact nodes; solid nodes correspond to nodes in A ∪ {a}; 
dotted nodes correspond to the precondition and action nodes that are considered 
during the evaluation process; the gray nodes are those selected by RelaxedPlan. 

Regarding the precondition p3 of a, assume that it can be achieved only by a5 and a6. 
These actions have a common precondition (p12) that is an effect of a4, an action belonging 
to ACTS (because it was already selected by RelaxedPlan(Pre(a1), INIT_l, ∅)). The other 
preconditions of these actions belong to INIT_l. Since |Threats(a5)| = 0 and |Threats(a6)| = 
1, Bestaction(p3) is a5. Consequently, Aset(RelaxedPlan(Pre(a), INIT_l, ∅)) is {a1, a4, a5}. 

Actions for Threats(a) in the Relaxed Plan 

Concerning the execution of RelaxedPlan for Threats(a), i.e., RelaxedPlan({q}, INIT_l, {a1, a4, 
a5}), suppose that the only actions for achieving q are a7 and a8. Since the precondition 
p14 of a7 is an effect of a5, which is an action in the input set A (it belongs to the relaxed 
subplan computed for the preconditions of a), and Threats(a7) is empty, the best action 
chosen by RelaxedPlan to support q is a7. It follows that the set of actions returned by 
RelaxedPlan is {a1, a4, a5, a7}. 

Temporal Value for Pre(a) 

We now consider the evaluation of the temporal value returned by RelaxedPlan(Pre(a), 
INIT_l, ∅). According to the temporal values specified in the table of Figure 4, the value of 
t at step 1 is Time(p2) = 220. As illustrated above, RelaxedPlan for Pre(a) is recursively 
executed to evaluate the preconditions of a1 (the action chosen to achieve p1) and then of 
a4 (the action chosen to achieve p4). In the evaluation of the preconditions of a4, at step 1 
of RelaxedPlan(Pre(a4), INIT_l, ∅) t is set to 50, i.e., the maximum value between Time(p9) 
and Time(p10), and the algorithm returns (∅, 50). 

In the evaluation of the preconditions of a1, at step 1 of RelaxedPlan(Pre(a1), INIT_l, ∅) 
t is set to Time(p5) = 170, at step 8 RelaxedPlan(Pre(a4), INIT_l, ∅) returns (∅, 50), while 
steps 9-10 set T(p12) to 50 + 100 (the duration of a4), and at step 13 t is set to MAX{170, 50 + 
100}. Hence, the recursive execution of RelaxedPlan applied to the preconditions of a1 
returns ({a4}, 170), and at step 13 of RelaxedPlan(Pre(a), INIT_l, ∅) t is set to MAX{220, 170 + 
70} = 240. 

As we have seen in the first part of the example, the action chosen to support p3 is 
a5. The recursive execution of RelaxedPlan(Pre(a5), INIT_l, {a1, a4}) applied to the 
preconditions of a5 returns ({a1, a4}, 170). In fact, the only precondition of a5 that is not in 
INIT_l (p12) is achieved by an action already in ACTS (a4). Moreover, since T(p12) = 150 
and Time(p11) = 170, the estimated end time of a5 is 170 + 30 = 200. At step 13 of 
RelaxedPlan(Pre(a), INIT_l, ∅) t is then set to MAX{240, 200} and the output of 
RelaxedPlan is ({a1, a4, a5}, 240). 
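
The trace above can be replayed with a compact Python sketch of the RelaxedPlan scheme of Figure 3. It is a minimal illustration, not LPG's code. Durations, the times 220, 170 and 50, and the threat counts of a5 and a6 come from the example; everything else is an assumption made to obtain a self-contained instance: the Num-acts values, the individual times of p9 and p10 (only their maximum is stated), the duration of a6, the zero threat counts of a1 and a4, the names p11 and p13 for the remaining preconditions of a5 and a6, and the omission of the alternatives a2 and a3.

```python
def best_action(g, achieved, dom):
    """Bestaction(g): minimise MAX Num-acts over open preconditions + |Threats|."""
    def score(a):
        open_pre = dom["pre"][a] - achieved
        return max((dom["num_acts"][p] for p in open_pre), default=0) + dom["threats"][a]
    candidates = [a for a in sorted(dom["pre"])
                  if g in dom["add"][a]
                  and all(dom["num_acts"][p] >= 0 for p in dom["pre"][a])]
    if not candidates:
        # Simplification: the paper instead returns a special high-cost action.
        raise ValueError(f"{g} is unreachable")
    return min(candidates, key=score)


def relaxed_plan(goals, init, acts, dom):
    """Return (set of actions, estimated earliest time), as in Figure 3."""
    t = max((dom["time"][g] for g in goals & init), default=0)         # step 1
    goals, acts = goals - init, set(acts)                              # step 2
    achieved = set().union(*(dom["add"][a] for a in acts))             # step 3
    t = max([t] + [dom["T"][g] for g in goals & achieved])             # step 4
    while goals - achieved:                                            # step 5
        g = min(goals - achieved)                                      # step 6 (fixed order)
        best = best_action(g, achieved, dom)                           # step 7
        sub_acts, sub_t = relaxed_plan(dom["pre"][best], init, acts, dom)  # step 8
        end = sub_t + dom["dur"][best]
        for f in dom["add"][best] - achieved:                          # steps 9-10
            dom["T"][f] = end
        acts = sub_acts | {best}                                       # step 11
        achieved = set().union(*(dom["add"][a] for a in acts))         # step 12
        t = max(t, end)                                                # step 13
    return acts, t                                                     # step 14


DOM = {
    "pre": {"a1": {"p4", "p5"}, "a4": {"p9", "p10"},
            "a5": {"p11", "p12"}, "a6": {"p12", "p13"}},   # p11, p13: assumed names
    "add": {"a1": {"p1"}, "a4": {"p4", "p12"}, "a5": {"p3", "p14"}, "a6": {"p3"}},
    "dur": {"a1": 70, "a4": 100, "a5": 30, "a6": 30},      # a6: assumed
    "threats": {"a1": 0, "a4": 0, "a5": 0, "a6": 1},       # a1, a4: assumed
    "time": {"p2": 220, "p5": 170, "p9": 50, "p10": 20,    # p10, p13: assumed
             "p11": 170, "p13": 0},
    "num_acts": {"p1": 2, "p3": 2, "p4": 1, "p12": 1, "p14": 2,  # assumed values
                 "p2": 0, "p5": 0, "p9": 0, "p10": 0, "p11": 0, "p13": 0},
    "T": {},  # T-times filled in by steps 9-10
}
INIT = {"p2", "p5", "p9", "p10", "p11", "p13"}
PLAN, END = relaxed_plan({"p1", "p2", "p3"}, INIT, set(), DOM)
```

Under these assumptions the sketch selects a1, a4 and a5 and estimates an end time of 240, in agreement with the pair derived in the text.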

3.2.3 Estimating the terms of E 

As noted before, the terms of the action evaluation function E are computed by using 
the relaxed (sub)plan π_r for a set of preconditions. The number of actions in π_r and the 
threats of these actions are used to define a heuristic estimate of the additional search cost 
that would be introduced by adding an action a to the current TA-graph, or removing it 
(i.e., the Search-cost terms of E). Note that in general this is not an admissible heuristic, 
because it can overestimate the minimum number of search steps needed to cope with the 
inconsistency under consideration. 

The Temporal-cost term of E(a)^i is an estimation of the earliest time when the new 
action a would terminate, given the actions in π_r and the earliest time when π_r can be 
applied in the context of the current action graph. 13 The Temporal-cost term of E(a)^r is 
an estimation of the earliest time when all preconditions that would become unsupported 
by removing a from the current action graph could be supported again. 

The Execution-cost term of E(a)^i is an estimation of the additional execution cost that 
would be required to satisfy the preconditions of a, and is derived by summing the cost of 
each action a' in π_r (Cost(a')). The Execution-cost term of E(a)^r is estimated similarly by 

13. The makespan of π_r is not a lower bound for Temporal-cost(a) because the possible parallelization of 
π_r with the actions already in A is not considered. 

EvalAdd(a) 

Input: An action node a that does not belong to the current TA-graph. 
Output: A pair formed by a set of actions and a temporal value t. 

1. INIT_l ← Supported-facts(Level(a)); 
2. Rplan ← RelaxedPlan(Pre(a), INIT_l, ∅); 
3. t1 ← MAX{0, MAX{Time(a') | Ω ⊨ a' ≺ a}}; 
4. t2 ← MAX{t1, End-time(Rplan)}; 
5. A ← Aset(Rplan) ∪ {a}; 
6. Rplan ← RelaxedPlan(Threats(a), INIT_l − Threats(a), A); 
7. return (Aset(Rplan), t2 + Duration(a)). 

EvalDel(a) 

Input: An action node a that belongs to the current TA-graph. 
Output: A pair formed by a set of actions and a temporal value t. 

1. INIT_l ← Supported-facts(Level(a)); 
2. Rplan ← RelaxedPlan(Unsup-facts(a), INIT_l, ∅); 
3. return Rplan. 

Figure 5: Algorithms for estimating the search, execution and temporal costs for the 
insertion (EvalAdd) and removal (EvalDel) of an action node a. Rplan is a pair of 
values, identified by Aset(Rplan) and End-time(Rplan), where the first is a set of 
actions and the second a temporal value. Num-acts, Supported-facts, Duration 
and Threats have been defined in Section 3.2.1. Unsup-facts(a) denotes the set 
of precondition nodes that become unsupported by removing a from A. 



considering the preconditions of the actions that would become unsupported when removing 
a from the current action graph. More formally, E is defined as follows: 

$$
E(a)^i:\ \begin{cases}
\text{Execution-cost}(a)^i = \sum_{a' \in Aset(\text{EvalAdd}(a))} Cost(a') \\
\text{Temporal-cost}(a)^i = \text{End-time}(\text{EvalAdd}(a)) \\
\text{Search-cost}(a)^i = |Aset(\text{EvalAdd}(a))| + \sum_{a' \in Aset(\text{EvalAdd}(a))} |\text{Threats}(a')|
\end{cases}
$$

$$
E(a)^r:\ \begin{cases}
\text{Execution-cost}(a)^r = \sum_{a' \in Aset(\text{EvalDel}(a))} Cost(a') - Cost(a) \\
\text{Temporal-cost}(a)^r = \text{End-time}(\text{EvalDel}(a)) \\
\text{Search-cost}(a)^r = |Aset(\text{EvalDel}(a))| + \sum_{a' \in Aset(\text{EvalDel}(a))} |\text{Threats}(a')|
\end{cases}
$$

where EvalAdd(a) and EvalDel(a) are the functions defined in Figure 5. EvalAdd(a) returns 
two values: the set of actions in π_r (Aset) and an estimation of the earliest time when the 
new action a would terminate (End-time). Similarly for EvalDel(a), which returns the set of 
actions in the relaxed plan achieving the preconditions that would become unsupported if 
a were removed from A, together with an estimation of the earliest time when all these 
preconditions would become supported. The relaxed subplans used in EvalAdd(a) and 
EvalDel(a) are computed by RelaxedPlan, as described in Section 3.2.1. 
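
Given the pair returned by EvalAdd or EvalDel, the three terms follow by simple sums. The sketch below is illustrative only: the function name `cost_terms` is hypothetical, unit action costs are assumed, and the threat counts of a1, a4 and a are assumed values (the text only states that a5 and a7 threaten nothing in the running example).

```python
def cost_terms(aset, end_time, cost, threats, removed=None):
    """Return (Execution-cost, Temporal-cost, Search-cost) for one modification.

    aset, end_time: the pair returned by EvalAdd(a) or EvalDel(a)
    removed:        the action a when evaluating a removal, else None
    """
    execution = sum(cost[a] for a in aset)
    if removed is not None:
        execution -= cost[removed]  # the removed action no longer has to be paid for
    search = len(aset) + sum(threats[a] for a in aset)
    return execution, end_time, search


# Addition of a, using the action set and end time of the running example.
COST = {"a": 1, "a1": 1, "a4": 1, "a5": 1, "a7": 1}     # assumed unit costs
THREATS = {"a": 1, "a1": 0, "a4": 0, "a5": 0, "a7": 0}  # a, a1, a4: assumed
ADD_TERMS = cost_terms({"a1", "a4", "a5", "a", "a7"}, 270, COST, THREATS)

# Removal of a hypothetical action r whose lost support is restored by b.
DEL_TERMS = cost_terms({"b"}, 100, {"b": 3, "r": 2}, {"b": 0}, removed="r")
```

For the addition this gives an execution cost of 5, a temporal cost of 270 and a search cost of 6; for the removal the execution cost is 3 − 2 = 1.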

After having computed the state INIT_l using Supported-facts(l), in step 2 EvalAdd uses 
RelaxedPlan to compute a relaxed subplan (Rplan) for achieving the preconditions of the 
new action a from INIT_l. Steps 3-4 compute an estimation of the earliest time when a can 
be executed (t2) as the maximum of the end times of all the actions preceding a in A 
(t1) and End-time(Rplan). Steps 5-6 compute a relaxed plan for Threats(a) taking 
account of a and the actions in the first relaxed subplan. 

EvalDel is simpler than EvalAdd, because the only new inconsistencies that can be 
generated by removing a are the precondition nodes supported by a (possibly through the no-op 
propagation of its effects) that would become unsupported. Unsup-facts(a) denotes the 
set of these nodes. 

Of course, an action elimination from A to cope with an inconsistency could remove 
some additional inconsistencies (the unsupported preconditions of the eliminated action). 
Similarly, an action that is added to A to support a certain precondition could support 
additional preconditions as well. However note that, as described in Section 3.1, Walkplan, 
like Walksat, does not consider possible improvements during the evaluation of the search 
neighborhood. Hence, EvalDel and EvalAdd do not take account of additional inconsistencies 
that are removed from A as positive "side-effects" of coping with the inconsistency under 
consideration. 

In order to illustrate the steps of EvalAdd, consider again the example of Figure 4. As 
shown in Section 3.2.2, the pair assigned to Rplan by step 2 of EvalAdd(a) is ({a1, a4, a5}, 240) 
(which is the pair of values returned by RelaxedPlan(Pre(a), INIT_l, ∅)). At step 3 of 
EvalAdd(a) suppose that t1 is set to 230 (i.e., that the highest temporal value assigned to the 
actions in the TA-graph that must precede a is 230). Step 4 sets t2 to MAX{230, 240}, and the 
execution of RelaxedPlan({q}, INIT_l − {q}, {a1, a4, a5, a}) at step 6 returns ({a1, a4, a5, a, a7}, 
t_q), where t_q is a temporal value that is ignored in the rest of the algorithm, because it does 
not affect the estimated end time of a. Thus, since the duration of a is 30, the output of 
EvalAdd(a) is ({a1, a4, a5, a, a7}, 240 + 30). 
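
The steps of EvalAdd and EvalDel in Figure 5 can be sketched as thin wrappers around a relaxed-plan routine. In this sketch the routine is passed in as a parameter, and a stub that simply returns the pairs quoted in the example stands in for it; the argument layout and the contents of `INIT` are assumptions made for illustration.

```python
def eval_add(a, pre, threats, init, pred_end_times, duration, relaxed_plan):
    """EvalAdd(a): relaxed plan for Pre(a), then for Threats(a)."""
    acts, end_time = relaxed_plan(pre, init, set())               # step 2
    t1 = max([0] + list(pred_end_times))                          # step 3
    t2 = max(t1, end_time)                                        # step 4
    acts, _ = relaxed_plan(threats, init - threats, acts | {a})   # steps 5-6
    return acts, t2 + duration                                    # step 7


def eval_del(unsup_facts, init, relaxed_plan):
    """EvalDel(a): relaxed plan for the facts left unsupported by removing a."""
    return relaxed_plan(unsup_facts, init, set())


def stub_relaxed_plan(goals, init, acts):
    """Stand-in returning the pairs quoted in the running example."""
    if goals == {"p1", "p2", "p3"}:
        return {"a1", "a4", "a5"}, 240
    return set(acts) | {"a7"}, 0  # the temporal value t_q is ignored by eval_add


INIT = {"p2", "q"}  # assumed: only the facts needed by this illustration
RESULT = eval_add("a", {"p1", "p2", "p3"}, {"q"}, INIT, [230], 30, stub_relaxed_plan)
```

With t1 = 230 and a duration of 30, the sketch returns the action set {a1, a4, a5, a, a7} with end time 270, matching the output derived above.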

3.3 Computing Reachability and Temporal Information 

The techniques described in the previous subsection for computing the action evaluation 
function use heuristic reachability information about the minimum number of actions 
required to achieve a fact f from INIT_l (Num-acts(f, l)), and the earliest times for actions 
and preconditions. LPG precomputes Num-acts(f, l) for l = 1 and any fact f, i.e., it 
estimates the minimum number of actions required to achieve f from the initial state I of the 
planning problem before starting the search. For l > 1, Num-acts(f, l) can be computed 
only during search because it depends on which action nodes are in the current TA-graph 
(at levels preceding l). Since during search many action nodes can be added and removed, 
it is important that the computation of Num-acts(f, l) is fast. 
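
The precomputation for l = 1 can be pictured as a forward pass over the ground actions, in the spirit of the ComputeReachabilityInformation scheme of Figure 6. The Python sketch below is an illustration under stated assumptions rather than LPG's implementation: actions are given as plain dictionaries, applicable actions are processed in alphabetical order, unreachable facts keep the value −1, and the toy domain is hypothetical.

```python
def required_actions(init, goals, action_of, pre, add):
    """RequiredActions: count the actions of a backward-chained relaxed plan."""
    acts, goals = set(), set(goals) - init
    while goals:
        a = action_of[goals.pop()]
        acts.add(a)
        produced = set().union(*(add[b] for b in acts))
        goals = (goals | pre[a]) - init - produced
    return len(acts)


def compute_reachability(init, pre, add, dur):
    """Return (num_acts, time_fact) estimated from the initial state `init`."""
    facts = set(init).union(*pre.values(), *add.values())
    num_acts = {f: (0 if f in init else -1) for f in facts}
    time_fact = {f: 0 for f in init}
    action_of = {f: "a_start" for f in init}
    reached, new, pending = set(), set(init), set(pre)
    while new:
        reached |= new
        new = set()
        while True:
            applicable = sorted(a for a in pending if pre[a] <= reached)
            if not applicable:
                break
            a = applicable[0]
            ra = required_actions(init, pre[a], action_of, pre, add)
            t = max((time_fact[f] for f in pre[a]), default=0)
            for f in add[a]:
                unseen = f not in (reached | new)
                if unseen or time_fact[f] > t + dur[a]:
                    time_fact[f] = t + dur[a]
                if unseen or num_acts[f] > ra + 1:
                    num_acts[f] = ra + 1
                    action_of[f] = a
            new |= add[a] - reached
            pending.discard(a)
    return num_acts, time_fact


# Hypothetical toy domain: g needs x and y; v depends on the unreachable fact u.
PRE = {"A": {"i"}, "B": {"i"}, "C": {"x", "y"}, "D": {"g"}, "E": {"u"}}
ADD = {"A": {"x"}, "B": {"y"}, "C": {"g"}, "D": {"x"}, "E": {"v"}}
DUR = {"A": 10, "B": 5, "C": 2, "D": 1, "E": 1}
NUM_ACTS, TIME_FACT = compute_reachability({"i"}, PRE, ADD, DUR)
```

In this toy domain g is estimated to need three actions (A, B and C) and to become true no earlier than time 12, while u and v remain marked as unreachable.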

Figure 6 gives ComputeReachabilityInformation, the algorithm used by LPG for computing
Num_acts(f, 1), which tries to balance the quality of the estimation against the
computational effort to derive it. The same algorithm could be used for (re)computing
Num_acts(f, l) after an action insertion/removal for any l > 1 (when l > 1, the algorithm
takes Supported_facts(l) as input instead of I).¹⁴ In addition to Num_acts(f, l),
ComputeReachabilityInformation derives heuristic information about the earliest possible time
of every fact f reachable from I (Time_fact(f, 1)). LPG can use Time_fact(f, 1) to assign an
initial temporal value (Time(f)) to any unsupported fact node representing f, instead of
leaving Time(f) undefined as we indicated in Section 2.2. This can give a more accurate
estimation of the earliest start time of an action with unsupported preconditions, which
is defined as the maximum value over the times assigned to its preconditions. Note that
Time_fact(f, 1) is not updated when actions are added to (or removed from) the current
TA-graph.

ComputeReachabilityInformation(I, O)

Input: The initial state of the planning problem under consideration (I) and all ground
instances of the operators (O);

Output: An estimate of the number of actions (Num_acts) and of the earliest time
(Time_fact) required to achieve each fact from I.

 1. forall facts f /* the set of all facts is precomputed by the operator instantiation phase */
 2.    if f ∈ I then
 3.       Num_acts(f, 1) ← 0; Time_fact(f, 1) ← 0; Action(f, 1) ← a_start;
 4.    else Num_acts(f, 1) ← −1;
 5. F ← I; F_new ← I; A ← O;
 6. while F_new ≠ ∅
 7.    F ← F ∪ F_new; F_new ← ∅;
 8.    while A' = {a ∈ A | Pre(a) ⊆ F} is not empty
 9.       a ← an action in A';
10.       ra ← RequiredActions(I, Pre(a));
11.       t ← MAX over f ∈ Pre(a) of Time_fact(f, 1);
12.       forall f ∈ Add(a)
13.          if f ∉ F ∪ F_new or Time_fact(f, 1) > (t + Duration(a)) then
14.             Time_fact(f, 1) ← t + Duration(a);
15.          if f ∉ F ∪ F_new or Num_acts(f, 1) > (ra + 1) then
16.             Num_acts(f, 1) ← ra + 1;
17.             Action(f, 1) ← a;
18.       F_new ← F_new ∪ Add(a) − F;
19.       A ← A − {a};

RequiredActions(I, G)

Input: A set of facts I and a set of action preconditions G;

Output: An estimate of the minimum number of actions required to achieve all facts in G
from I (ACTS).

 1. ACTS ← ∅;
 2. G ← G − I;
 3. while G ≠ ∅
 4.    g ← an element of G;
 5.    a ← Action(g, 1);
 6.    ACTS ← ACTS ∪ {a};
 7.    G ← G ∪ Pre(a) − I − ∪ over b ∈ ACTS of Add(b);
 8. return(|ACTS|).

Figure 6: Algorithms for computing heuristic information about the reachability of each
fact.

Before illustrating in detail the algorithm for computing reachability information that 
we used for the competition version of LPG, we should also note that preconditions involv- 
ing numerical quantities are ignored by this technique. A new version taking numerical 
preconditions into account is under development. 

3.3.1 Computation of Num_acts and Time_fact

For clarity we first describe only the steps of ComputeReachabilityInformation used to derive
Num_acts, and then we comment on the computation of Time_fact. In steps 1-4, the
algorithm initializes Num_acts(f, 1) to 0, if f ∈ I, and to −1 otherwise (indicating that f
is not reachable). Then in steps 5-19 it iteratively constructs the set F of facts that are
reachable from I, starting with F = I, and terminating when F cannot be further extended.
In this forward process each action is applied at most once, when its preconditions are
contained in the current F. The set A of the available actions is initialized to the set of
all possible actions (step 5), and it is reduced after each action application (step 19). The
internal while-loop (steps 8-19) applies the actions in A to the current F, possibly deriving
a new set of facts F_new in step 18. If F_new is not empty, F is extended with F_new and the
internal loop is repeated. Since F monotonically increases and the number of facts is finite,
termination is guaranteed. When an action a in A' (the subset of actions currently in A
that are applicable to F) is applied, the reachability information for its effects is revised
as follows. First we estimate the minimum number ra of actions required to achieve Pre(a)
from I using the subroutine RequiredActions (step 10). Then we use ra to possibly update
Num_acts(f, 1) for any effect f of a (steps 12, 15-16). If the application of a leads to a
lower estimation for f, i.e., if ra + 1 is less than the current value of Num_acts(f, 1), then
Num_acts(f, 1) is set to ra + 1. In addition, a data structure indicating the current best
action to achieve f from I (Action(f, 1)) is set to a (step 17).¹⁵

For any fact f in the initial state, the value of Action(f, 1) is a_start (step 3).
RequiredActions uses Action to derive ra through a backward process starting from the input
set of action preconditions (G), and ending when G ⊆ I. The subroutine incrementally constructs
a set of actions (ACTS) achieving the facts in G and the preconditions of the actions already
selected (using Action). At each iteration the set G is revised by adding the preconditions
of the last action selected, and removing the facts belonging to I or to the effects of
actions already selected (step 7). Termination of RequiredActions is guaranteed because every
element of G is reachable from I.

14. In order to obtain better performance, for l > 1 LPG uses an incremental version of
ComputeReachabilityInformation updating Num_acts(f, l) after each action insertion/removal.
We omit the details of this version of the algorithm.

15. In the actual algorithm implemented in LPG, when we set Action(f, 1), we also consider the
case in which Num_acts(f, 1) is equal to ra + 1; if the execution cost of a is lower than that of
the current Action(f, 1), or they have the same cost but a supports f earlier, then Action(f, 1)
is revised to a. For clarity these details are omitted from the formal description of the algorithm.

Time_fact(f, 1) is computed in a way similar to Num_acts(f, 1). Step 3 of
ComputeReachabilityInformation initializes it to 0, for any fact f in the initial state. Then,
at every application of an action a in the forward process described above, we estimate the
earliest possible time t for applying a as the maximum value over the times currently assigned
to its preconditions (step 11). For any effect f of a that has not been considered yet (i.e., that
is not in F), or that has a temporal value higher than t plus the duration of a, steps 13-14
set Time_fact(f, 1) to this lower value (because we have found a shorter relaxed plan to
achieve f from I).
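The forward process and the backward counting described above can be sketched in Python as follows. This is only an illustrative reading of Figure 6, not LPG's implementation (which is written in C): the `Act` record and the function names are hypothetical, unreachable facts are simply absent from the dictionaries instead of being marked with −1, and applicable actions are taken in list order rather than in random order.

```python
from typing import Dict, FrozenSet, Iterable, List, NamedTuple, Set, Tuple


class Act(NamedTuple):
    """Hypothetical ground action: name, preconditions, add effects, duration."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    duration: float


def required_actions(init: Set[str], goals: Iterable[str],
                     best_action: Dict[str, Act]) -> int:
    """Backward process of RequiredActions: size of a relaxed plan for goals."""
    acts: Dict[str, Act] = {}
    open_goals = set(goals) - init
    while open_goals:
        g = open_goals.pop()
        a = best_action[g]          # current best action achieving g
        acts[a.name] = a
        achieved: Set[str] = set()
        for b in acts.values():
            achieved |= b.add
        # Add the preconditions of a, drop facts in init or already achieved.
        open_goals = (open_goals | a.pre) - init - achieved
    return len(acts)


def compute_reachability(init: Set[str], actions: List[Act]
                         ) -> Tuple[Dict[str, int], Dict[str, float], Dict[str, Act]]:
    """Forward process: estimates Num_acts, Time_fact and Action at level 1."""
    num_acts = {f: 0 for f in init}
    time_fact = {f: 0.0 for f in init}
    best_action: Dict[str, Act] = {}
    reached: Set[str] = set(init)      # F
    new_facts: Set[str] = set(init)    # F_new
    available = list(actions)          # A
    while new_facts:
        reached |= new_facts
        new_facts = set()
        while True:
            applicable = [a for a in available if a.pre <= reached]
            if not applicable:
                break
            a = applicable[0]          # each action is applied at most once
            ra = required_actions(init, a.pre, best_action)
            t = max((time_fact[f] for f in a.pre), default=0.0)
            for f in a.add:
                unseen = f not in reached and f not in new_facts
                if unseen or time_fact[f] > t + a.duration:
                    time_fact[f] = t + a.duration
                if unseen or num_acts[f] > ra + 1:
                    num_acts[f] = ra + 1
                    best_action[f] = a
            new_facts |= a.add - reached
            available.remove(a)
    return num_acts, time_fact, best_action
```

For a toy problem with initial fact p, actions a1 (p gives q, duration 10), a2 (p gives r, duration 20) and a3 (q and r give g, duration 5), the sketch estimates three actions and an earliest time of 25 for g.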

The complexity of ComputeReachabilityInformation is polynomial in the number of facts
and actions in the problem/domain under consideration. Step 10, the most expensive step
of the algorithm, is executed O(|O|) times, where O is the set of all actions, and |O| is
the size of this set. It is easy to see that the worst-case time complexity of
RequiredActions is O(|O|). It follows that the time complexity of
ComputeReachabilityInformation is O(|O|²). However, we have experimentally observed that very
often RequiredActions terminates returning numbers much smaller than |O| (i.e., that the number
of iterations that the algorithm performs is well below |O|). Finally, we observe that the order
in which actions are examined for their application in the forward process can affect the output
results. In our current implementation we use a random order.

Figure 7 illustrates the algorithm with an example. Suppose that the facts in the initial
state I are f_1, ..., f_8, and that the actions in O are a_1, ..., a_7, where the subscripts of
the actions correspond to the order in which they are applied by the algorithm. The first actions
that are applied are a_1, a_2 and a_3, because their preconditions are in F, which is initially
set to I. The Num_acts value of these preconditions is set to zero, because RequiredActions
applied to them returns zero. In the internal for-loop of the algorithm we update the reachability
information for each effect of these actions. In particular, consider the effects f_1 and f_9 of
a_1. Since f_1 is not a new fact (it belongs to I) and its Num_acts and Time_fact values
are set to the minimum (initial) values, steps 14 and 16 do not revise them. Since f_9 is
a new fact, step 14 sets Time_fact(f_9) to 10 (i.e., the duration of a_1), and step 16 sets
Num_acts(f_9) to 1 (ra is zero). Moreover, Action(f_9, 1) is set to a_1 by step 17. The effects
of a_2 and a_3 are handled similarly.

At this point, since there is no other action that is applicable in F, the internal
while-loop terminates, F is set to F ∪ {f_9, f_10, f_12}, and F_new is set to ∅. The set A' of
the actions in A that are applicable is {a_4, a_5, a_6}. Consider the application of a_4. We have
that ra at step 10 is set to 2, because RequiredActions(I, Pre(a_4)) sets ACTS to {a_1, a_2}
(note that f_1 ∈ I, Action(f_9, 1) = a_1, Action(f_10, 1) = a_2, and all preconditions of these
actions are in I). Thus, Time_fact(f_13) is set to 80 (i.e., the maximum temporal value
assigned to a precondition of a_4, 30, plus the duration of a_4), and Num_acts(f_13) to 3. The
effects of the actions a_5 and a_6 are handled in a similar way. However, it is worth noting
that Num_acts(f_15) is first set to 3, when we examine a_5, and then revised to 2, when we
examine a_6. Analogously, Time_fact(f_15) is first set to 120, and then revised to 80, while
Action(f_15, 1) is first set to a_5 and then to a_6.

Figure 7: An example illustrating ComputeReachabilityInformation. The numbers in
parentheses are Num_acts values. O = {a_1, a_2, ..., a_7}. The subscript of each action
corresponds to the order in which it is applied.

Consider now the preconditions of the last applicable action a_7. RequiredActions
applied to Pre(a_7) by step 10 returns 6, because the set ACTS of actions selected by the
subroutine is {a_4, a_1, a_2, a_5, a_3, a_6}. Step 11 of ComputeReachabilityInformation sets t
to Time_fact(f_14) = 120, and hence the Time_fact value for the new effect of a_7 is set to
120 + 20, while its Num_acts value is set to 6 + 1.

3.3.2 Related work on reachability information 

Other techniques for estimating the cost of reaching a fact (or a set of facts) from a
certain state have been proposed and used in some planners, e.g., HSP (Bonet & Geffner,
2001), FF (Hoffmann & Nebel, 2001) and SAPA (Do & Kambhampati, 2002). When comparing
ComputeReachabilityInformation with these techniques, we should note that in LPG
the Num_acts values are used to select the actions forming relaxed plans achieving sets
of action preconditions or goals (computed by RelaxedPlan). These relaxed plans are then
used by the action evaluation function E guiding the search. In general, this approach is
similar to FF's and SAPA's methods, but with some significant differences that we comment
on below.

Bonet & Geffner proposed two basic heuristics for HSP, h_max and h_add. In h_add the
(search) cost of a set of facts is the sum of the costs of each individual fact, while in h_max it
is the maximum cost over all individual costs. As noted by Haslum and Geffner (2000), h_max
and h_add are approximations of the optimal cost function of a relaxed problem where delete
effects are ignored. h_add ignores positive interactions among subgoals that could make one
goal simpler after a second one has been achieved (this makes h_add non-admissible). h_max
is admissible, but it is less informative and effective for Bonet & Geffner's HSP planner.

A difference between the forward process of our algorithm for computing Num_acts and
Bonet & Geffner's forward propagation for computing h_add is that in our propagation every
action is applied at most once, while in their propagation it can be considered more than
once (for computing h_max it suffices to apply each action once with an appropriate order).
This restriction, which we introduced for efficiency reasons, can clearly lead to overestimation
of reachability costs. By adding a new step between steps 17 and 18 of
ComputeReachabilityInformation that adds to A every action with f as precondition, we can obtain
a more accurate cost propagation, as in h_add. However, this could slow down the planning
process, given that reachability information may be (re)computed many times during search.

Another important difference concerns the use of the subroutine RequiredActions at step
10 for estimating the cost of reaching a set of preconditions G. Instead of considering
the maximum value over the costs of the preconditions or the sum of their costs, as in
h_max and h_add, respectively, we compute a relaxed plan for G, and we count the number
of actions forming it. This can be seen as an intermediate approach between the h_max and
h_add methods, aimed at taking account of positive interactions among subgoals.¹⁶
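The difference among the three ways of costing a set of preconditions can be seen on a small example. The sketch below is a simplification under stated assumptions: unit action costs, exactly one (hypothetical) supporting action per fact, and one effect per action. It is not the HSP or LPG code.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Set, Tuple

# Hypothetical supporter table: fact -> (action name, preconditions).
Supporter = Tuple[str, FrozenSet[str]]


def fact_cost(f: str, init: Set[str], supporter: Dict[str, Supporter],
              combine: Callable[[List[int]], int]) -> int:
    """Cost of one fact; combine is max for h_max and sum for h_add."""
    if f in init:
        return 0
    _, pre = supporter[f]
    costs = [fact_cost(p, init, supporter, combine) for p in pre]
    return 1 + (combine(costs) if costs else 0)


def set_cost(goals: Iterable[str], init: Set[str],
             supporter: Dict[str, Supporter],
             combine: Callable[[List[int]], int]) -> int:
    """Cost of a set of facts as the max (h_max) or sum (h_add) of fact costs."""
    costs = [fact_cost(g, init, supporter, combine) for g in goals]
    return combine(costs) if costs else 0


def relaxed_plan_size(goals: Iterable[str], init: Set[str],
                      supporter: Dict[str, Supporter]) -> int:
    """Number of distinct actions in a relaxed plan, counting shared ones once."""
    acts: Set[str] = set()
    stack = [g for g in goals if g not in init]
    while stack:
        g = stack.pop()
        name, pre = supporter[g]
        if name in acts:
            continue
        acts.add(name)
        stack.extend(p for p in pre if p not in init)
    return len(acts)


# Toy problem: a1 achieves s from p; a2 achieves q from s; a3 achieves r from s.
INIT = {"p"}
SUPPORTER = {
    "s": ("a1", frozenset({"p"})),
    "q": ("a2", frozenset({"s"})),
    "r": ("a3", frozenset({"s"})),
}
H_MAX = set_cost(["q", "r"], INIT, SUPPORTER, max)
H_ADD = set_cost(["q", "r"], INIT, SUPPORTER, sum)
PLAN_SIZE = relaxed_plan_size(["q", "r"], INIT, SUPPORTER)
```

Here the shared action a1 is counted twice by the sum and hidden by the maximum, while the relaxed-plan count falls between the two values.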

Finally, another difference concerns the initial set of actions used in the forward process.
While our set (O) does not contain actions with preconditions that are mutex, Bonet &
Geffner's forward processes for computing h_max and h_add contain them. The use of our
restricted set of actions would make the approximation of h_max and h_add more accurate.

As observed by Hoffmann (2001), FF's reachability technique is similar to h_max, and so
the previous observations about h_max compared to our reachability information hold also
for FF's technique. Another difference with respect to FF concerns the choice of actions
forming the relaxed plans. While RelaxedPlan and EvalAdd take threats into account, FF's
relaxed plans do not consider them. Moreover, FF can generate relaxed plans including
actions with mutex preconditions, while we exclude such actions.

Most of the differences with respect to HSP's and FF's reachability information that
we have outlined appear to also hold when comparing ComputeReachabilityInformation and
SAPA's reachability techniques (in particular, the use of RequiredActions for estimating the
search cost of a set of preconditions, the application of an action at most once in the
forward process, and the use of a more restrictive set of actions O). Another significant
difference is that, while SAPA's reachability information concerns execution and temporal
costs, our information concerns mainly search costs. As a consequence, the action choices in
RelaxedPlan depend mainly on the search costs (as we pointed out, when there is more than
one action with the lowest search cost, Bestaction chooses the one with lower execution
cost). In LPG the execution and temporal costs of the relaxed plans are subsequently
taken into account by the action evaluation function E, using the actions in the computed
relaxed plans. A major motivation for giving primary importance to search costs was that
we designed our planner as an any-time planning system, which can compute a first solution
quickly, and then derive additional solutions with incrementally better quality, but requiring
more CPU-time (this incremental process is described in Section 3.6).

16. Note that if we replaced step 10 with ra ← "sum of Num_acts(f, 1) for each f in Pre(a)",
then the resulting algorithm would be quite similar to h_add (using the additional step for
reconsidering actions already applied that we mentioned above).

3.4 Updating Ordering Constraints and Temporal Values 

In this subsection we describe the generation during search of action ordering constraints
in the current TA-graph A, and the update at each search step of the temporal values
associated with the fact and action nodes of A. If during search the planner adds an action
node a to A for supporting a precondition of another action node b, then a ≺_C b is added
to Ω. Moreover, for each action c in A that is mutex with a, if Level(a) < Level(c), then
a ≺_E c is added to Ω, otherwise (Level(c) < Level(a)) c ≺_E a is added to Ω. If the planner
removes a from A, then any ordering constraint involving a is removed from Ω.
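The bookkeeping on Ω just described can be sketched as follows. The representation of Ω as a set of (before, kind, after) triples, with kind "C" for ≺_C and "E" for ≺_E, and all names below are assumptions made for illustration only.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Hypothetical encoding of an ordering constraint: (before, kind, after).
Constraint = Tuple[str, str, str]


def add_action_constraints(omega: Set[Constraint], a: str, b: str,
                           level: Dict[str, int],
                           mutex: Set[FrozenSet[str]]) -> None:
    """Update omega after inserting a to support a precondition of b.

    level maps every action node of the graph (including a) to its level;
    mutex holds unordered pairs of mutually exclusive actions.
    """
    omega.add((a, "C", b))
    for c in level:
        if c != a and frozenset((a, c)) in mutex:
            if level[a] < level[c]:
                omega.add((a, "E", c))
            else:
                omega.add((c, "E", a))


def remove_action_constraints(omega: Set[Constraint], a: str) -> None:
    """Drop every ordering constraint that involves the removed action a."""
    omega.difference_update({k for k in omega if a in (k[0], k[2])})
```

For example, inserting an action x at level 3 to support y at level 4, with x mutex with w at level 1 and with z at level 5, adds x ≺_C y, w ≺_E x and x ≺_E z.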

The addition/removal of an action node a also determines a possible revision of Time(x),
the temporal value assigned to any fact and action x that is (directly or indirectly) connected
to a through the ordering constraints in Ω. Essentially, the algorithm for revising the
temporal values assigned to the nodes of A performs a simple forward propagation starting
from the effects of a, and updating level by level the times of the actions (together with the
related precondition and effect nodes) that are constrained by Ω to start after the end of a.
If every precondition is of type overall and every effect is of type at end, when an action
node a' is considered for possible temporal revision, Time(a') becomes the maximum value
over the temporal values assigned to (a) its preconditions and (b) the actions preceding
a' according to Ω, plus the duration of a'. The times assigned to the effect nodes of a'
are revised accordingly. If a' is the only action node supporting a precondition node f, or
its temporal value is lower than the value assigned to the other action nodes supporting
f, then Time(f) is set to Time(a'). For instance, suppose that, in order to support the
precondition node f_7 of a_4 in the TA-graph in Figure 1, we insert the action node a_5 at
level 4 (see Figure 8). a_5 has duration 110 and precondition node f_8. Since Time(f_8) =
120, Time(f_7) becomes 230, which is propagated to a_4 and its effects. Time(a_4) becomes
270, Time(f_12) is revised to 220 (Time(a_3)), Time(f_13) is revised to 270 (Time(a_4)), while
Time(f_8) remains 120 (Time(a_2)).
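A minimal sketch of this forward propagation, for the case in which all preconditions are of type overall and all effects of type at end, is given below. It recomputes the times in one pass over the actions in level order instead of revising them incrementally, and the data layout is hypothetical. The usage values mirror the a_5/a_4 fragment of the example; the duration of 40 for a_4 is an assumption inferred from 270 − 230, not a value stated in the text.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical action record: (name, duration, preconditions, effects).
ActionSpec = Tuple[str, float, List[str], List[str]]


def propagate_times(actions: List[ActionSpec],
                    initial_fact_time: Dict[str, float],
                    omega: Set[Tuple[str, str]]
                    ) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Compute Time for action and fact nodes.

    actions must be listed in level order; omega holds (before, after)
    pairs of action names. Unsupported preconditions (facts without a
    time) are skipped, i.e. their time is treated as undefined.
    """
    fact_time = dict(initial_fact_time)
    act_time: Dict[str, float] = {}
    for name, duration, pre, eff in actions:
        bounds = [fact_time[p] for p in pre if p in fact_time]
        bounds += [act_time[b] for (b, c) in omega
                   if c == name and b in act_time]
        act_time[name] = max(bounds, default=0.0) + duration
        for e in eff:
            # A fact takes the time of its earliest supporting action.
            if e not in fact_time or act_time[name] < fact_time[e]:
                fact_time[e] = act_time[name]
    return act_time, fact_time


ACT_TIME, FACT_TIME = propagate_times(
    [("a5", 110.0, ["f8"], ["f7"]), ("a4", 40.0, ["f7"], ["f13"])],
    {"f8": 120.0},
    {("a5", "a4")},
)
```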

Some operators in the domains used for the 3rd IPC contain (pre)conditions of type
"at end" or "at start", and effects of type "at start" (Fox & Long, 2003), i.e.,
preconditions that must hold at the end or at the beginning of the action, and effects that are
true at the beginning of the action. In the following we revise the definition of Time that we
have given in Section 2.2 to consider actions involving these types of preconditions/effects.
When an action node a has a precondition node p of type at end, in the definition of Time(a),
we use Time(p) − Duration(a) instead of Time(p).¹⁷ Precondition nodes of type at start,
instead, are treated in the definition of Time(a) as precondition nodes of type overall.

17. In the special cases in which a precondition of a is of type either at end or overall, and
it is also an effect node of type at start of a, Time(p) is not considered in the definition of
Time(a) because the action itself makes p true (unless p is also a precondition node of a of
type at start).

Ω = {a_1 ≺_C a_4; a_2 ≺_C a_3; a_2 ≺_C a_5; a_5 ≺_C a_4; a_1 ≺_E a_2; a_2 ≺_E a_3; a_2 ≺_E a_4}

Figure 8: Update of the TA-graph of Figure 1 after the addition of action node a_5 at level
4 to support the precondition node f_7 of a_4. Dashed edges form chains of no-ops
that are blocked by mutex actions. Round brackets contain temporal values
assigned by T to the fact nodes (circles) and the action nodes (squares). The
numbers in square brackets represent action durations. "(-)" indicates that the
corresponding fact node is not supported.



If an action node a has an effect node e of type at start, when we estimate Time(e),
instead of using Time(a), we use the minimum value over (1) Time(a'), for any action
node a' supporting e by an effect of type at end, (2) Time(a'') − Duration(a''), for any
action node a'' supporting e by an effect of type at start, and (3) Time(a) − Duration(a)
(because e is supported at the start time of a).
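The two revised definitions can be written compactly as below. This is a sketch with hypothetical function names and argument layouts; the lower bound of 0 on start times is an assumption of the sketch.

```python
from typing import List, Tuple


def action_end_time(duration: float,
                    preconditions: List[Tuple[float, str]],
                    preceding: List[float]) -> float:
    """Time(a) with typed preconditions.

    preconditions holds (Time(p), kind) pairs, kind being "overall",
    "at start" or "at end"; preceding holds Time(b) for the actions b
    that must precede a.
    """
    bounds = [0.0]  # assumption: an action cannot start before time 0
    for time_p, kind in preconditions:
        # An "at end" precondition only has to hold when a terminates.
        bounds.append(time_p - duration if kind == "at end" else time_p)
    bounds.extend(preceding)
    return max(bounds) + duration


def at_start_effect_time(time_a: float, duration_a: float,
                         end_supporters: List[float],
                         start_supporters: List[Tuple[float, float]]) -> float:
    """Time(e) for an "at start" effect e of a.

    end_supporters holds Time(a') for actions supporting e at end;
    start_supporters holds (Time(a''), Duration(a'')) for actions
    supporting e at start.
    """
    candidates = [time_a - duration_a]
    candidates += end_supporters
    candidates += [t - d for t, d in start_supporters]
    return min(candidates)
```

With a duration of 10, an overall precondition at time 30 and an at end precondition at time 35, the action ends at 40: the at end precondition contributes only 35 − 10 = 25 to the maximum.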

When preconditions of a type different from overall and effects of a type different from
at end are present in the domain specification, in some cases two mutex actions can
partially overlap.¹⁸ The version of LPG that took part in the 3rd IPC did not handle these
possible overlaps, and any pair of mutex actions was treated by always imposing an ordering
constraint between the end of an action and the start of the other one. While this is always
a sound way of ordering mutex actions, it might over-constrain the actions, introducing
unnecessary delays in the plan. However, in the test problems of the 3rd IPC the possibility
of overlapping mutex actions is rare, and when it is possible it does not affect the temporal
quality of the plans significantly. Recently, we have extended the treatment of mutex
actions in LPG, distinguishing various types of interferences and competing needs. These are
handled by ordering constraints between different endpoints of the involved actions,
allowing overlapping mutex actions. Experimental results using the SimpleTime variant of the
Satellite domain show that the new version of LPG generates plans which are about 10%
better (in terms of makespan) than those computed by the competition version. Moreover,
the overhead introduced by the more sophisticated management of the temporal information
is on average negligible. A detailed description of how temporal information is managed in
the new version of LPG is given in another recent paper (Gerevini, Saetti, & Serina, 2003).

18. For example, if f is a precondition at start of a and ¬f is an effect at end of b, although
a and b are mutex, a can overlap b (e.g., a can start after the start time of b and terminate
before the end time of b). If a is at a level preceding the level of b, the only ordering
constraint that should be imposed is that the start of a is before the end of b. Otherwise, the
imposed ordering constraint is that the end of b is before the start of a, and so the two
actions do not overlap.

3.5 Numerical State Variables 

In this subsection we briefly describe how LPG deals with preconditions and effects involving 
numerical quantities. We start with a brief description of numerical preconditions and 
effects in PDDL2.1, and then we show how the plan representation, search neighborhood and 
heuristics that we presented in the previous sections have been extended to handle them. 

In PDDL2.1 a state s for a planning domain involving numerical variables is a pair
(p(s), v(s)) where p(s) is a set of ground atoms (positive facts), and v(s) = (r_1, ..., r_n) is
a tuple of real numbers representing the values of the n numerical variables v^1, v^2, ..., v^n.
A numerical expression is an arithmetic expression over the set V of these variables and
the real numbers. A numerical precondition is a triple (exp, rel, exp'), where exp and exp'
are numerical expressions, and rel ∈ {<, ≤, =, ≥, >} is a relational operator. A numerical
effect is a triple (v^i, ass, exp), where v^i ∈ V is a variable, ass ∈ {:=, +=, -=, *=, /=} is an
assignment operator (using a C-like notation), and exp is a numerical expression.

In order to handle numerical domains, we have extended the notion of TA-graph with
numerical fact nodes representing values of numerical variables. For each level l in the
current action graph A and each numerical variable v^i ∈ V, there is a numerical fact node
representing the value for v^i at level l. The resulting tuple of real values at level l is
denoted by Num_values(l). These values are derived by applying all actions in A at the
levels preceding l, starting from the initial level and following the order of the corresponding
levels.¹⁹ The values of the numerical fact nodes at the initial level, Num_values(0), are
the real numbers assigned to the corresponding numerical variables in the initial state of
the planning problem under consideration. Similarly, we can associate with each level l a
set of facts that are true (Supported_facts(l)), given the actions in A at levels preceding l.
In this way we can define a numerical state s_l = (Supported_facts(l), Num_values(l)) for
each level l of A.

A numerical precondition (exp, rel, exp') of an action at a level l is supported if and
only if the values of exp and exp' evaluated in s_l satisfy the relation rel.
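The derivation of Num_values(l) and the support test for a numerical precondition can be sketched as follows. Expressions are modelled as Python callables over a variable assignment, and each level carries the numerical effects of its action; both choices, like the names used, are assumptions of this sketch.

```python
import operator
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
Expr = Callable[[State], float]
NumEffect = Tuple[str, str, Expr]   # (variable, assignment operator, exp)
NumPre = Tuple[Expr, str, Expr]     # (exp, rel, exp')

REL = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
       ">=": operator.ge, ">": operator.gt}
ASSIGN = {":=": lambda old, x: x, "+=": operator.add, "-=": operator.sub,
          "*=": operator.mul, "/=": operator.truediv}


def num_values(initial: State, levels: List[List[NumEffect]]) -> List[State]:
    """Return the numerical values at each level, index 0 being the initial one.

    levels[i] lists the numerical effects of the action at level i (an empty
    list if that action has none); expressions are evaluated in the state
    preceding the action.
    """
    states = [dict(initial)]
    for effects in levels:
        current = states[-1]
        following = dict(current)
        for var, ass, exp in effects:
            following[var] = ASSIGN[ass](current[var], exp(current))
        states.append(following)
    return states


def supported(pre: NumPre, state: State) -> bool:
    """A numerical precondition holds iff rel is satisfied in the state."""
    exp, rel, exp2 = pre
    return REL[rel](exp(state), exp2(state))


STATES = num_values(
    {"fuel": 30.0},
    [[("fuel", "+=", lambda s: 20.0)], [("fuel", "-=", lambda s: 10.0)]],
)
NEEDS_FUEL: NumPre = (lambda s: s["fuel"], ">=", lambda s: 45.0)
```

In this toy run the variable takes the values 30, 50 and 40 at successive levels, so the precondition is supported only at the level following the first action.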



19. This way of ordering actions at levels before l is consistent with the action ordering
constraints in A (if any). Furthermore, note that if an action a at level l has a numerical
precondition involving a numerical variable v^j, then any action b with an effect affecting the
value of v^j is mutex with a. So, if A is a TA-graph, then b ≺_E a ∈ Ω. Otherwise (A is a
simple LA-graph without temporal information), b ≺ a is implied by the levels of a and b.

Every time an action is added/removed to/from a level of A we apply/retract the
numerical effects of the action, which can modify the values associated with some numerical fact
nodes at the next level. These changes are propagated to the following levels of the graph.
During this propagation, we identify the numerical preconditions that become supported or
unsupported. Moreover, if the value of a numerical fact node affecting the duration of an
action is changed, then we update the duration of this action.

The local search neighborhood associated with an unsupported numerical precondition
p = (exp, rel, exp') of an action a is defined as the set of linear action graphs obtained by
either removing a, or adding a new action that decreases the "gap" between the values of
exp and exp' according to rel (possibly supporting p).²⁰ In the competition version of LPG,
we considered adding an action only to the level immediately before the level of p (while for
a boolean unsupported precondition q an action supporting it can be added to any preceding
level). We are currently studying an extension of the neighborhood in which supporting
actions can be added to any preceding level also in the case of numerical preconditions.
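The arithmetic behind the example of footnote 20 (a precondition x > 100, a current value of 30, and a single available effect that increases x by 20) can be checked with a few lines. The function name is hypothetical, and the sketch only covers a strict "greater than" precondition with a constant increment.

```python
def actions_needed(current: float, increment: float, threshold: float) -> int:
    """Count the increasing actions needed before current > threshold holds."""
    if increment <= 0:
        raise ValueError("the effect must increase the variable")
    count = 0
    while not current > threshold:
        current += increment   # each added action closes part of the gap
        count += 1
    return count
```

Three actions only reach 90, so a fourth one is required, and each of them is added by a different search step.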

We now briefly describe how LPG computes the relaxed plans used by EvalAdd and
EvalDel for numerical domains. This is done by an extended version of RelaxedPlan handling
numerical preconditions in a very simple way. Since the current version of
ComputeReachabilityInformation ignores numerical preconditions, there is no Num_acts value
associated with them. Hence, when RelaxedPlan chooses the (heuristic) best action to support a
subgoal g, for each numerical precondition p involved in the definition of Bestaction(g) (see
Section 3.2.1), Num_acts(p, l) is replaced by 1, i.e., the estimated minimum cost to satisfy
any numerical precondition is always 1. (Of course, this is quite a strong assumption giving
weak information; we are currently working on a new version of the planner using more
informative heuristics for constructing relaxed plans involving numerical preconditions.)
Another difference in the definition of Bestaction(g) is that, if g is a numerical
precondition, instead of considering only the actions supporting g, we consider every action that
decreases the gap between the values of the expressions forming g.

The relaxation of the plans computed by the extended version of RelaxedPlan concerns
both the negative effects, which are ignored for plan validity (but considered to count
possible threats), and a form of monotonic change of the minimum and maximum possible
values for the numerical quantities. We start from the numerical initial state INIT_l =
(Supported_facts(l), Num_values(l)) and, for each numerical variable involved in an action
in the relaxed plan constructed from INIT_l, we consider only the minimum/maximum
values that the variable can assume given the actions already in the relaxed plan. These
values are monotonically decreased/increased whenever an action is added to the relaxed
plan. Specifically, we define two tuples of numerical values, v_max and v_min, that are both
initialized using Num_values(l). If an effect of an action in the relaxed plan increases the
value of a variable v^i by a quantity δ, then we increase v^i_max by δ; while, if it decreases the
value of v^i by δ, then we decrease v^i_min by δ. During the construction of a relaxed plan,
when we check whether a numerical precondition p = (v^x, >, v^y) is supported, we evaluate
v^x > v^y considering v^x_max as the value assigned to v^x, and v^y_min as the value assigned to
v^y. If an expression involves more than one numerical variable (e.g., p = (v^x − v^y, >, v^z)),
we consider the combination of the maximum/minimum values that is most favorable to
satisfy the condition (the value of v^x is v^x_max, the value of v^y is v^y_min, and the value of
v^z is v^z_min). Expressions involving other relational operators are handled similarly.

20. Note that it can be necessary to add more than one action to support a numerical
precondition. These actions are added to different levels by different search steps. For instance,
suppose that p = (x, >, 100) is a numerical precondition, and that a is the only action with a
numerical effect e increasing the value of x. If e increases x by 20 and the current value of x
is 30, then we need four actions to support p.
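The monotonic bounds can be sketched as two dictionaries that only ever move outwards. The function names and the restriction to the two precondition shapes used as examples in the text are assumptions of this sketch.

```python
from typing import Dict, Tuple

Bounds = Dict[str, float]


def init_bounds(num_values: Dict[str, float]) -> Tuple[Bounds, Bounds]:
    """Both v_max and v_min start from the numerical values of INIT_l."""
    return dict(num_values), dict(num_values)


def apply_relaxed_effect(v_max: Bounds, v_min: Bounds,
                         var: str, delta: float) -> None:
    """An increase only raises v_max; a decrease only lowers v_min."""
    if delta > 0:
        v_max[var] += delta
    elif delta < 0:
        v_min[var] += delta


def supported_gt(x: str, y: str, v_max: Bounds, v_min: Bounds) -> bool:
    """Relaxed test of (v^x, >, v^y): most favourable values on each side."""
    return v_max[x] > v_min[y]


def supported_diff_gt(x: str, y: str, z: str,
                      v_max: Bounds, v_min: Bounds) -> bool:
    """Relaxed test of (v^x - v^y, >, v^z)."""
    return v_max[x] - v_min[y] > v_min[z]
```

Because a bound is never tightened again, a precondition that becomes supported during the construction of the relaxed plan stays supported, which is the numerical counterpart of ignoring negative effects.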

3.6 Multi-Criteria Incremental Plan Quality 

As we have seen, our approach can model different plan quality criteria determined by
action execution costs and action durations. The coefficients α, β and γ of the action
evaluation function E specified in Section 3.2 are used to weigh the relative importance of
the execution and temporal costs of E, as well as to normalize them with respect to the
search cost. Specifically, LPG uses the following function for evaluating the insertion of an
action node a (the evaluation function E(a)^r for removing an action node is analogous):

$$E(a)^i = \frac{\mu_E}{max_{ET}} \cdot Execution\_cost(a)^i + \frac{\mu_T}{max_{ET}} \cdot Temporal\_cost(a)^i + \frac{1}{max_S} \cdot Search\_cost(a)^i,$$

where μ_E and μ_T are non-negative coefficients that weigh the relative importance of the
execution and temporal costs, respectively. Their values can be set by the user, or they can
be automatically derived from the expression defining the plan metrics in the formalization
of the problem. The factors 1/max_ET and 1/max_S are used to normalize the terms of E to
a value less than or equal to 1. The value of max_ET is defined as μ_E · max_E + μ_T · max_T,
where max_E (max_T) is the maximum value of the first (second) term of E over all
TA-graphs in the neighborhood, multiplied by the number k of inconsistencies in the current
action graph; max_S is defined as the maximum value of Search_cost over all possible action
insertions/removals that eliminate the inconsistency under consideration. The role of k is to
decrease the importance of the first two optimization terms when the current plan contains
many inconsistencies, and to increase it when the search approaches a valid plan. That is,
writing max'_E and max'_T for the two maxima before the multiplication by k, E(a)^i
can be rewritten as

$$E(a)^i = \frac{1}{k \cdot (\mu_E \cdot max'_E + \mu_T \cdot max'_T)} \cdot \left(\mu_E \cdot Execution\_cost(a)^i + \mu_T \cdot Temporal\_cost(a)^i\right) + \frac{1}{max_S} \cdot Search\_cost(a)^i.$$

Without this normalization the first two terms of E could be much higher than the
value of the third term. This would guide the search towards good quality plans without
paying sufficient attention to their validity. Instead, we would like to have the search give
more importance to reducing the search cost, rather than optimizing the quality of a plan,
especially when the current partial plan contains many inconsistencies.
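The effect of the normalization can be illustrated numerically. The sketch below evaluates every candidate of a neighborhood at once; the `Costs` record, the function name and the guards against zero denominators are assumptions made for the example.

```python
from typing import List, NamedTuple


class Costs(NamedTuple):
    """Hypothetical cost triple of one candidate insertion/removal."""
    execution: float
    temporal: float
    search: float


def evaluate_neighborhood(cands: List[Costs], mu_e: float, mu_t: float,
                          k: int) -> List[float]:
    """Normalized evaluation of each candidate, with k inconsistencies."""
    max_e = k * max(c.execution for c in cands)
    max_t = k * max(c.temporal for c in cands)
    max_s = max(c.search for c in cands)
    max_et = mu_e * max_e + mu_t * max_t
    scores = []
    for c in cands:
        # Quality part: execution and temporal costs, scaled down by k.
        quality = ((mu_e * c.execution + mu_t * c.temporal) / max_et
                   if max_et > 0 else 0.0)
        # Search part: normalized to at most 1.
        search = c.search / max_s if max_s > 0 else 0.0
        scores.append(quality + search)
    return scores


NEIGHBORHOOD = [Costs(10.0, 100.0, 2.0), Costs(20.0, 50.0, 4.0)]
MANY_FLAWS = evaluate_neighborhood(NEIGHBORHOOD, 1.0, 1.0, k=2)
FEW_FLAWS = evaluate_neighborhood(NEIGHBORHOOD, 1.0, 1.0, k=1)
```

Doubling k halves the quality part of every score while leaving the search part unchanged, so with many inconsistencies the ranking of the candidates is driven mostly by the search cost.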

Our planner can produce a succession of valid plans where each plan is an improvement 
of the previous ones in terms of quality. The first plan generated is used to initialize a new 
search for a second plan with better quality, and so on. This is a process that incrementally 
improves the quality of the plans, and the search can be stopped at any time to give the 
best plan computed so far (each plan can be written in a file as soon as it is derived) . When 
lpg starts a new search, some inconsistencies are forced in the TA-graph representing the 
previous plan, and the resulting TA-graph is used to initialize the search. Similarly, during 
search some random inconsistencies are forced in the current TA-graph when a valid plan 
that does not improve the plan of the previous search is reached. This is done by choosing 
a small set R of action nodes that are removed from the action graph together with (1) the
action nodes supporting their preconditions and (2) the action nodes with a precondition
supported by an action in R. The elements of R are chosen by taking account of the values
of μ_E and μ_T. If μ_E > μ_T, we randomly remove action nodes giving higher probability to
those representing actions with higher execution costs, otherwise preference is given to the
action nodes having a higher impact on the plan makespan. 21
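The selection of R and of the nodes removed with it can be sketched as follows. This is only an illustration under stated assumptions: the dictionary keys `exec_cost` and `makespan_impact`, the weighted sampling without replacement, and the representation of causal support as (supporter, supported) pairs are hypothetical choices, not LPG's actual data structures.

```python
import random


def choose_removal_set(actions, mu_e, mu_t, size, rng=random):
    """Pick a small set R, biased by execution cost or by makespan impact.

    `actions` maps an action name to a dict with the hypothetical keys
    'exec_cost' and 'makespan_impact'.
    """
    key = "exec_cost" if mu_e > mu_t else "makespan_impact"
    pool = dict(actions)
    chosen = set()
    while pool and len(chosen) < size:
        names = list(pool)
        # A small epsilon keeps zero-weight actions selectable.
        weights = [pool[n][key] + 1e-9 for n in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        chosen.add(pick)
        del pool[pick]
    return chosen


def nodes_to_remove(chosen, supports):
    """Expand R with (1) the supporters of R's preconditions and (2) the
    actions having a precondition supported by an action in R.

    `supports` is a set of (supporter, supported) pairs of action names.
    """
    removed = set(chosen)
    for supporter, supported in supports:
        if supported in chosen:
            removed.add(supporter)
        if supporter in chosen:
            removed.add(supported)
    return removed


actions = {
    "load": {"exec_cost": 1, "makespan_impact": 5},
    "drive": {"exec_cost": 9, "makespan_impact": 2},
    "unload": {"exec_cost": 1, "makespan_impact": 5},
    "refuel": {"exec_cost": 4, "makespan_impact": 1},
}
supports = {("load", "drive"), ("drive", "unload")}
R = choose_removal_set(actions, mu_e=1, mu_t=0, size=2, rng=random.Random(0))
print(nodes_to_remove(R, supports))
```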

In the 3rd IPC, for each test problem attempted, we considered only the first and the 
last solutions generated by LPG within five CPU-minutes. The first solution was used to 
test how fast our planner can be; the last solution to test how good a solution can be. Often 
the first solution has low quality compared to the last one, while the last solution requires 
much more CPU-time than the first. The other fully-automated planners in the competition 
did not exhibit any-time behavior like LPG. So, when we compare our two solutions with 
the single solution derived by the other planners, we should consider that LPG very often 
derives additional solutions of intermediate quality, and requiring intermediate CPU-time. 
In particular, as will be shown in the next section, the following situation can arise: (1) the
first solution found by LPG requires less CPU-time than any other planner, but has quality
worse than the best solution found by the other planners, and (2) the last solution of
LPG has superior quality to all other planners but requires more CPU-time. In such a case
LPG can find an intermediate solution which is still better than the solutions found by all
other planners and is derived in less CPU-time.

4. Experimental Results 

All our techniques are implemented in LPG. The system is written in C and is available
from http://prometeo.ing.unibs.it/lpg. In this section we present some experimental
results illustrating the efficiency of LPG using the test problems of the 3rd IPC. These
problems belong to several domains, most of which have some variants containing different
features of PDDL2.1. The variants are named "Strips", "SimpleTime", "Time", "Complex",
"Numeric" and "HardNumeric", and are all handled by our planner. For a description of
the domains and of their variants, the reader can visit the official web site of the 3rd
IPC (www.dur.ac.uk/d.p.long/competition.html).

All tests were conducted on the official machine of the competition, an AMD Athlon(tm)
MP 1800+ (1500MHz) with 1 Gbyte of RAM. The results for LPG correspond to median
values over five runs for each problem considered. The CPU-time limit for each run was 5 
minutes, after which termination was forced. 22 Notice that the results that we present here 
are not exactly the same as the official results of the competition, where for lack of time we 
were not able to run our system a sufficient number of times to obtain meaningful statistical 
data. However, in general the new results are very similar to those of the competition, with 

21. In the version of LPG that took part in the 3rd IPC this second preference was based on a simple
estimation of the temporal impact of each action node. We are currently testing a newer version that 
selects such actions more accurately by using the critical path in the graph of the ordering constraints 
in the TA-graph. 

22. When the CPU-time limit was exceeded in one or two runs, the median values are derived by considering 
these runs as those producing the worst results. When the CPU-time limit was exceeded in three or 
four runs, instead of the median values, we considered the worst results of the remaining successful runs. 
This happened in 16 of the 442 problems solved. 
Planner       Solved        Attempted     Success ratio
LPG             442            468             94%
FF              237         [illegible]        83%
Simplanner  [illegible]     [illegible]     [illegible]
Sapa             80            122             66%
MIPS            331            508             65%
VHPOP           122            224             54%
Stella           50            102             49%
TP4              26            204             13%
TPSYS            14            120             12%
SemSyn           11            144              8%

Table 1: Number of problems attempted and solved by the planners that took part in the
3rd IPC ordered by their success ratio. The data from the planners compared
with LPG are from the official web site of the 3rd IPC. The data for LPG do not
consider the 20 problems in Satellite HardNumeric, which are all solved by the
current version of the planner, slightly improving the success ratio.



some considerable improvement in Satellite Complex and in the Rovers domains, where 
many problems could not be solved due to a minor bug in the parser of our planner that 
was easily fixed right after the competition. 
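The aggregation rule of footnote 22 can be written down as a short Python sketch. It assumes that lower values are better (as for CPU-time and for the plan metric values used here) and that a run exceeding the CPU-time limit is represented by `None`; the function name is hypothetical.

```python
import math


def aggregate_runs(results):
    """Summarize the runs of one problem following footnote 22 (sketch).

    `results` holds one value per run (lower is better), or None when the
    CPU-time limit was exceeded. Returns None if no run succeeded.
    """
    ok = sorted(r for r in results if r is not None)
    failures = len(results) - len(ok)
    if not ok:
        return None
    if failures <= 2:
        # Failed runs count as the worst results, then take the median.
        padded = ok + [math.inf] * failures
        return padded[len(padded) // 2]
    # Three or four failures: worst result of the remaining successful runs.
    return ok[-1]


print(aggregate_runs([3, 1, 2, None, None]))
print(aggregate_runs([5, None, None, None, 2]))
```

With five runs and at most two failures, the median always falls on a successful run, which is why the footnote needs a separate rule only for three or four failures.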

Overall, the number of problems attempted in the new tests by our planner was 468 
(over a total of 508 problems), and the success ratio was 94.4% (the problems attempted 
by LPG in the competition were 428 and the success ratio 87%). Table 1 gives these data
for every fully-automated planner that took part in the competition. The success ratio of 
LPG is the highest one over all competing domain-independent planners. 
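As a quick check of the arithmetic, the success ratios can be recomputed from the solved and attempted counts. The sketch below uses only the rows of Table 1 whose two counts are legible in this copy; the variable names are illustrative.

```python
# (solved, attempted) pairs for the rows of Table 1 with legible counts.
table1 = {
    "LPG": (442, 468),
    "Sapa": (80, 122),
    "MIPS": (331, 508),
    "VHPOP": (122, 224),
    "Stella": (50, 102),
    "TP4": (26, 204),
    "TPSYS": (14, 120),
    "SemSyn": (11, 144),
}


def success_ratio(solved, attempted):
    """Percentage of attempted problems that were solved."""
    return 100.0 * solved / attempted


ratios = {name: round(success_ratio(s, a), 1) for name, (s, a) in table1.items()}
for name, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print(f"{name:8s} {ratio:5.1f}%")
```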

The version of LPG that we used in the competition is integrated with an alternative 
search method that can be activated when the local search is not effective. This method is 
based on the same best-first search technique implemented in FF (Hoffmann & Nebel, 2001).
The only domain where we used best-first search instead of local search is FreeCells. The 40
problems that were not attempted by our planner are the 20 problems in Settlers Numeric 
and the 20 problems in Satellite HardNumeric. The first domain contains operators 
with universally quantified effects, which are not handled in the current version of LPG. 
The plan metrics of the problems in the second domain require maximizing the value of 
a certain numerical variable representing acquired data (data-stored), which is another 
feature of PDDL2.1 that the competition version of LPG did not handle properly. Many of
these problems were solved by the other fully-automated planners with the empty plan or
with plans of zero quality. While such plans could also have been derived by LPG, we did
not consider them interesting solutions. 23

23. Very recently we have extended LPG to handle maximization of plan metric expressions. This new 
version of LPG solves all the 20 test problems of Satellite HardNumeric, generating plans with quality 
higher than zero. The only fully-automated planner of the competition that derived solutions with 
data-stored > 0 is MIPS. An experimental comparison of MIPS and LPG considering only these solutions
We ran LPG with the same default settings for every problem attempted (maximum
numbers of search steps and restarts for each run, inconsistency selection strategy, and
noise factor), which can be modified by the user. The default initial value of the noise p
is 0.1. Note that this is a dynamic value that is automatically increased/decreased by the
planner during search, depending on the variance of the number of inconsistencies in the last
n search steps. In all our tests p was automatically increased if the variance did not change
significantly in the last 50 search steps. It was set to the initial default value otherwise.
The parameters μ_E and μ_T of the action evaluation function were automatically set using
the (linear) plan metric specified in the problem formalization. In particular, μ_E was set to
1, while μ_T was set to the coefficient weighing the total-time variable in the expression
specifying the plan metric. For instance, in the example of plan metric given at the end of
Section 2, the coefficient weighing total-time is 4, and so for that problem μ_T was set to
4. If no plan metric was specified, then μ_E was set to 0.5 and μ_T to 0.
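The dynamic noise adjustment can be sketched in Python as follows. Only the initial value 0.1 and the window of 50 search steps come from the text; the growth factor, the tolerance deciding when the variance "did not change significantly", the upper bound on p, and the comparison of consecutive non-overlapping windows are assumptions made for this illustration.

```python
from statistics import pvariance


class NoiseController:
    """Raises the noise p while the search stagnates, resets it otherwise."""

    def __init__(self, initial=0.1, window=50, growth=1.5, tolerance=0.05, cap=1.0):
        self.initial = initial      # default initial noise (from the text)
        self.window = window        # number of search steps per check (from the text)
        self.growth = growth        # assumed multiplicative increase
        self.tolerance = tolerance  # assumed relative change regarded as significant
        self.cap = cap              # assumed upper bound on p
        self.p = initial
        self.samples = []
        self.prev_variance = None

    def record(self, inconsistencies):
        """Register the number of inconsistencies after one search step."""
        self.samples.append(inconsistencies)
        if len(self.samples) < self.window:
            return self.p
        variance = pvariance(self.samples)
        self.samples = []
        if self.prev_variance is not None:
            change = abs(variance - self.prev_variance)
            if change <= self.tolerance * max(self.prev_variance, 1.0):
                self.p = min(self.cap, self.p * self.growth)  # stagnation
            else:
                self.p = self.initial                         # back to default
        self.prev_variance = variance
        return self.p


ctrl = NoiseController()
for _ in range(100):          # two windows with a constant inconsistency count
    ctrl.record(5)
stagnating = ctrl.p
for step in range(50):        # one window in which the count oscillates
    ctrl.record(0 if step % 2 else 10)
print(stagnating, ctrl.p)
```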

The performance of LPG was tested in terms of both CPU-time required to find a so- 
lution (LPG-speed) and quality of the best plan computed (LPG-quality) using at most five 
CPU-minutes. In the plots of Figures 9, 10, 11 and 12, on the x-axis we have the problem 
names (simplified with numbers); on the y-axis, in the plots for CPU-time we have mil- 
liseconds (logarithmic scale), while in the plots for plan quality we have the quality of the 
plans generated, measured using the plan metric expression in the corresponding problem 
specification. Note that the lower the plan quality values, the better the corresponding 
plans are. 

Figure 9 shows the performance of LPG-speed compared to the other competitors in some 
variants of four domains. 24 In DriverLog Strips, FF is on average the fastest planner, but 
LPG solves more problems, and it scales up somewhat better. In ZenoTravel SimpleTime, 
LPG outperforms the other competitors in terms of both number of problems solved and 
CPU-time (our planner is about one order of magnitude faster). In Satellite Complex 
the excellent performance of LPG is even more evident especially for the largest problems.
Finally, in Rovers Numeric, FF and LPG perform similarly, but our planner solves a larger 
number of problems. The plots concerning the performance of LPG-quality for these four 
domain variants are given in Figure 10. These results show that the solution computed by 
our planner was always similar to or better than the solution derived by any of the other 
planners. The most interesting differences are in Satellite Complex, where LPG-quality 
produced solutions of higher quality for almost every problem. 

In order to derive some general results about the performance of our approach with 
respect to all other fully-automated planners of the competition, we compared LPG with
the best result over all these planners. We will indicate these results as if they were produced 
by a hypothetical "SuperPlanner" (which does not exist). Clearly, if LPG performs generally 
better than the SuperPlanner in a certain domain, then in that domain it performs better 
than any other real planner that we considered. On the other hand, if it performs worse, 



shows that the plans generated by LPG have quality much higher than those computed by MIPS, and that
our planner is significantly faster (detailed results of this experiment are available from the web page of
LPG).

24. Complete results for all other domains and variants are available from
http://prometeo.ing.unibs.it/lpg/test-results.
Figure 9: CPU-time and number of problems solved by the fully-automated planners of the
3rd IPC for the domains DriverLog Strips, ZenoTravel SimpleTime, Satellite 
Complex and Rovers Numeric. 



this does not necessarily imply that there is a single real planner that generally performs 
better than LPG.

The plots of Figures 11 and 12 give complete results for Satellite, one of the domains 
where our planner performed particularly well in the temporal and Complex variants. The 
plots on the left show CPU-times for LPG-speed, LPG-quality, and the two corresponding 
versions of the SuperPlanner: a version in which, for each problem, we consider the fastest 
planner over all the other fully-automated planners, and a version in which we consider the 
planner that produced the best quality plan (of course it can be the case that the fastest 
planner for a problem is different from the planner that produces the best quality solution 
Figure 10: Quality of plans computed by the fully-automated planners of the 3rd IPC for
the domains DriverLog Strips, ZenoTravel SimpleTime, Satellite Complex, 
Depots Time and Rovers Numeric. In order to improve readability, the plot for 
DriverLog-Strips is given in logarithmic scale. 



for that problem). The plots on the right show plan quality for the two versions of LPG and 
the SuperPlanner. 
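The construction of the two versions of the SuperPlanner can be sketched in a few lines of Python. The data layout (a dictionary from planner name to the problems it solved, each with a CPU-time and a plan quality where lower is better) and the sample numbers are hypothetical; only the selection rule follows the description above.

```python
def super_planner(results, key):
    """Best result per problem over all planners, according to `key`.

    `results` maps planner -> {problem: (cpu_ms, quality)}; unsolved
    problems are simply absent. Returns problem -> (planner, outcome).
    """
    best = {}
    for planner, solved in results.items():
        for problem, outcome in solved.items():
            if problem not in best or key(outcome) < key(best[problem][1]):
                best[problem] = (planner, outcome)
    return best


def by_speed(outcome):
    return outcome[0]


def by_quality(outcome):
    return outcome[1]


# Hypothetical results, for illustration only.
results = {
    "FF": {"p1": (120, 30), "p2": (4000, 55)},
    "MIPS": {"p1": (900, 25), "p3": (70000, 80)},
}
speed_version = super_planner(results, by_speed)
quality_version = super_planner(results, by_quality)
print(speed_version["p1"], quality_version["p1"])
```

In this toy data the fastest planner on `p1` differs from the one producing the best plan, which is the situation noted in the parenthetical remark above.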

The results in these and in the following plots are mostly self-explanatory. In the temporal
and complex variants LPG-speed is often one or more orders of magnitude faster than the
SuperPlanner. In the Strips variant the SuperPlanner is faster for the smallest problems, 
but it is generally slower for the largest ones. In the Numeric variant the SuperPlanner is 
faster, but our planner produces solutions of better quality. Regarding LPG-quality, in all 
variants except Satellite Strips our planner performs much better than the SuperPlanner. 
In the Strips variant, the quality of the plans produced by LPG-quality is approximately the 
same as the quality of the plans generated by the SuperPlanner. 

Concerning the quality of the solutions computed by LPG-speed and the CPU-time 
required by LPG-quality, as we have described in Section 3.6, it is important to note that 






Gerevini, Saetti & Serina 



[Figure 11 plots not reproduced. Panels: Satellite-Strips, Satellite-SimpleTime and Satellite-Time. The left plots show CPU-time in milliseconds (logarithmic scale); the right plots show plan quality (number of steps for Satellite-Strips). Legend: LPG-speed and LPG-quality solved 20 problems in each variant; SuperPlanner (Speed) and SuperPlanner (Quality) solved 20 problems in Strips, 19 in SimpleTime and 20 in Time.]

Figure 11: Performance of LPG-speed (left plots) and LPG-quality (right plots) compared
with the SuperPlanner (speed and quality versions) in Satellite Strips,
SimpleTime and Time.
Figure 12: Performance of LPG-speed (left plots) and LPG-quality (right plots) compared
with the SuperPlanner (speed and quality versions) in Satellite Complex and 
Numeric. 



LPG produces additional intermediate solutions, which for clarity are not shown in these plots.
It is not surprising that very often plan quality for LPG-speed is poor with respect to plan
quality for LPG-quality, and that the CPU-time required by LPG-quality is much higher than
the CPU-time required by LPG-speed. Things become less clear if we compare plan quality
for LPG-speed (or CPU-time for LPG-quality) and plan quality for the speed version of the 
SuperPlanner (or CPU-time for the quality version of the SuperPlanner). 

Figure 13: Plan quality and the corresponding CPU-milliseconds (logarithmic scale) for the
solutions found by LPG (five runs) and the SuperPlanner for four problems in
Satellite Strips, SimpleTime, Time and Numeric.

Given the any-time nature of LPG, obviously there is a tradeoff between plan quality
and speed. The more CPU-time the planner is allowed to run, the better the last solution
generated. If we want to study this tradeoff experimentally, we need to consider not only
the first and last solutions that LPG found in the competition tests, but also the intermediate
solutions. In fact, we observed that in several cases LPG generates an intermediate
solution that has quality better than or similar to the quality of the best plan generated by
the SuperPlanner, and that requires no more CPU-time than the SuperPlanner. In
Figure 13 we give some support to our claim (a detailed analysis of all intermediate solutions
generated by our planner is beyond the scope of this paper). The plots in this figure
show CPU-time and quality of all plans generated by LPG (five runs) and by the SuperPlanner
for some problems in the Satellite domain. In Satellite-SimpleTime-pfile15 and
Satellite-Time-pfile6 (first two plots of Figure 13) LPG's first solutions (LPG-speed)
require less CPU-time than the first solution found by the SuperPlanner. On the other
hand, the quality of LPG-speed's solutions is worse than the quality of the solution found
by the speed version of the SuperPlanner (see also the corresponding plots of Figure 11,






Planning through Stochastic Local Search and Action Graphs in LPG 



keeping in mind that they are derived from median values over five runs). However, Figure
13 shows that for these two problems LPG-speed generates additional intermediate solutions
that have quality better than the solutions found by the SuperPlanner, and that
still require less CPU-time. Moreover, the third and fourth plots of Figure 13 show that
in Satellite-Strips-pfile9 and Satellite-Numeric-pfile3 the SuperPlanner is faster
than LPG-speed, but LPG finds intermediate solutions of quality better than the best solution
of the SuperPlanner using less CPU-time than the SuperPlanner and LPG-quality.
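The kind of check behind these observations can be sketched in Python: given the sequence of solutions produced in one run and a reference solution of the SuperPlanner, look for the earliest solution that is at least as good and requires no more CPU-time. The data layout and the numbers are hypothetical and are not read off Figure 13.

```python
def dominating_solution(lpg_solutions, reference):
    """Earliest LPG solution that is no slower and no worse than `reference`.

    `lpg_solutions` is a list of (cpu_ms, quality) pairs and `reference` a
    single such pair; lower quality values are better. Returns None when no
    solution satisfies both conditions.
    """
    ref_time, ref_quality = reference
    for cpu_ms, quality in sorted(lpg_solutions):
        if cpu_ms > ref_time:
            break
        if quality <= ref_quality:
            return (cpu_ms, quality)
    return None


# Hypothetical run: first, intermediate and last solutions of one problem.
run = [(50, 400), (300, 310), (2000, 280), (90000, 250)]
print(dominating_solution(run, reference=(1500, 330)))
print(dominating_solution(run, reference=(40, 330)))
```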

Since our main focus in this paper is temporal planning, it is interesting to compare LPG 
and the SuperPlanner in the Time variant of all competition domains. The detailed results of 
this comparison are given in Appendix B. As shown by the plots in this appendix, LPG-speed 
is usually faster than the SuperPlanner, and it always solves a larger number of problems, 
except in ZenoTravel, where our planner solves one problem less than the SuperPlanner. 
This problem was solved by MIPS, another planner of the 3rd IPC that performed well in the
temporal domains (Edelkamp, 2002). The percentage of the problems solved by LPG-speed
is 95.1%, while the percentage solved by the SuperPlanner is 77.5%. The percentage of the problems
in which our planner is faster is 81.4%, the percentage in which it is slower is 13.7%. 

Regarding LPG-quality, generally in these domains the quality of the best plans produced 
by our planner is similar to the quality of the plans generated by the SuperPlanner, with 
some significant differences in ZenoTravel, where in a few problems the SuperPlanner 
performs better, and in Satellite, where our planner always performs better. Overall, in 
the Time variant of all the domains the percentages of the problems in which our planner 
produces a solution of better/worse quality are the same as the percentages of the problems
in which LPG-speed is faster/slower. 

We have also analyzed the performance of LPG with respect to the SuperPlanner for 
all other domains and problems attempted. Appendices C and D give summary results. 25 
As for the Time-problems, in the SimpleTime problems LPG solves more problems than the
SuperPlanner, and the percentages of problems in which LPG-speed and LPG-quality perform
better than the SuperPlanner are even higher than the corresponding percentages for the Time
variants. In the Numeric and Strips problems, on average LPG-speed is less efficient than 
the SuperPlanner. This is mainly due to the generally good performance of FF in these 
domains. However, note that LPG-quality on average is better than the SuperPlanner in 
every domain except the Strips version of ZenoTravel. 

Overall, considering all problems attempted, LPG-speed performs better/worse than the 
SuperPlanner in 55.8/38.1% of the problems, while LPG-quality performs better/worse in 
71/11.6% of the problems. 

Finally, we ran our planner on some of the large problems that were used to test the 
hand-coded planners in the 3rd IPC. In this experiment LPG was tested using a PC Pentium 
III, 500 MHz, with 1 Gbyte of RAM, which is more than two times slower than the machine 
used for testing the hand-coded planners. Of course, we did not expect to solve these 
problems more efficiently than the hand-coded planners. This experiment was aimed at 
testing how far we are from planners exploiting domain knowledge. 

Figure 14 shows plots comparing the performance of LPG and the competing hand-coded
planners for the two temporal variants of Rovers. LPG solved 38 of the 40 problems

25. The paper in this issue by Long and Fox (2003) presents a detailed statistical analysis of all official 
results of the 3rd IPC. 
Figure 14: Performance of LPG in two temporal domains designed for hand-coded planners
competing at the 3rd IPC. LPG was tested on a machine that is more than two
times slower than the machine used to test the other planners. 



attempted. In terms of plan quality, very often LPG-quality generates plans that are nearly 
as good as those computed by the hand-coded planners, especially in
Rovers-SimpleTime-HandCoded. Interestingly, given that the machine used to test LPG was slower, in this
domain LPG-speed appears to perform slightly better than SHOP2 (Nau, Au, Ilghami, Kuter,
Murdock, Wu, & Yaman, 2003). In Rovers-Time-HandCoded LPG-speed can solve most of
the problems, but it does not perform as well. It remains an open question whether further 
research can reduce this gap significantly, but we are optimistic about this. 
5. Conclusions and Future Work

We have presented some new techniques for planning in PDDL2.1 domains that are imple- 
mented in LPG, an incremental (any time) planner producing multi-criteria quality plans. 
LPG was given an award for "distinguished performance of the first order" at the 3rd Inter- 
national Planning Competition, and additional experimental results presented in this paper 
give further evidence of the high performance of our system. 

Other related techniques that are implemented in LPG, but not described here, concern:
the restriction of the search neighborhood when it contains so many elements that their
evaluation can slow down the search excessively; different strategies to choose the
inconsistency to handle at each search step; the use of Lagrange multipliers in the action evaluation
function (Gerevini & Serina, 2002, 2003).

We have already mentioned some directions that we are pursuing to improve our sys- 
tem. These include, in particular, an extension of our algorithm for computing reachability 
information taking account of numerical preconditions and goals (a recent related method 
has been proposed by Hoffmann (2003)). In addition, we intend to test other local search 
strategies for action graphs based on the use of a "tabu list" (Gerevini & Serina, 1999), 
and further types of graph modifications, some of which were implemented in the previous 
version of LPG (Gerevini & Serina, 2002). This might be especially important for improving
the incremental plan-quality process. Another possible improvement of this process that 
is worth investigating is the use of dynamic coefficients to weigh the terms of the action 
evaluation function. When we start a new search for a plan of better quality, the weights 
of the terms representing the execution and temporal costs could be increased with respect 
to the term representing the search cost. This could guide the search towards plans better 
than those already derived, which is the purpose of the incremental process. 

Finally, other directions for improving temporal planning in LPG concern the treatment 
of a richer temporal representation to handle upper and lower bounds on the possible action 
durations, as well as the integration of temporal reasoning techniques to deal with temporal 
constraints between actions similar to those that can be stated using Allen's Interval Algebra 
(Allen, 1983) or STP-constraints (Dechter, Meiri, & Pearl, 1991). 
Appendix A: Mutex Relations in LPG and Related Work

LPG precomputes a set of mutex relations for the input planning problem using the two al- 
gorithms given in Figure 15, where Add(a) denotes the set of the positive effects of a, Del(a) 
the set of its negative effects, and Pre(a) the set of its preconditions. ComputeMutexFacts 
derives a set of mutex relations between facts, that are used by ComputeMutexActions to 
compute a set of relations between actions. The correctness of this second algorithm is 
obvious since it just applies the original definition of mutex relation (Blum & Furst, 1997).

ComputeMutexFacts iteratively constructs a set M of potential mutex relations and the 
set F of all possible facts for the planning problem under consideration. At each iteration 
we consider every possible action a (step 5) to possibly generate a set of new potential mutex 
relations (steps 7-11), and to possibly invalidate other potential mutex relations that have 
already been formulated (steps 12-18). The algorithm terminates when all possible facts 
have been considered (F* = F), and no new potential mutex relations can be generated 
(M* = M). When the algorithm terminates, M contains a set of persistent mutex relations 
between facts. A mutex relation m in M is persistent if there is no state that can be 
reached from the initial state of the problem, using the operators of the domain under 
consideration, in which the facts of m are both true. All mutex relations in the fixed-point 
level of a traditional planning graph are persistent. 

Given an action a, two facts f_1 and f_2 form a potential mutex relation m if (1) one of
them is a positive effect of a and the other is a negative effect (steps 7-9), or (2) one of them
is a positive effect of a and the other is (potentially) mutually exclusive with a precondition
of a (steps 7, 10 and 11). (1) is a natural way of hypothesizing mutex relations that is
used also by Gerevini and Schubert (1998). (2) is based on the observation that, if f_1 is an
effect of a, p ∈ Pre(a), f_2 ∉ Add(a), and f_2 is mutually exclusive with p, then in any state
resulting from the application of a to a reachable state, f_2 and f_1 cannot be both true.

A potential mutex relation m ∈ M between f_1 and f_2 becomes invalid if (1) there exists
an action containing the two facts of m among its positive effects (steps 13-14), or (2) f_1 (f_2) is
an add-effect of an action a, f_2 (f_1) is not deleted by a, and f_2 (f_1) is (potentially) mutually
exclusive with no precondition of a (steps 15-18). The first case is obvious, while the second
can be explained as follows. If f_1 is a positive effect of a, and we cannot exclude that f_2
is true in a state where a can be applied, then f_2 could persist from this state to the state
produced by a (similarly if f_2 is a positive effect of a).

Note that LPG handles negative preconditions as proposed by Koehler et al. (1997), i.e.,
no explicit atomic negation is available in LPG's language. Instead we model atomic negation
by introducing an additional predicate not-p(x) if ¬p(x) is needed and by formulating add
and delete effects correspondingly (this guarantees that not-p(x) and p(x) are mutex).
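This treatment of negation can be illustrated with a small Python sketch on ground facts. The dictionary layout of the actions is hypothetical, and adding not-p to the initial state whenever p is initially false is an assumption of the sketch that the text does not spell out.

```python
def compile_negations(actions, initial_state):
    """Replace negative preconditions by positive 'not-' facts (sketch).

    `actions` maps a name to a dict with the sets 'pre', 'neg_pre', 'add'
    and 'del' of ground facts. Returns (new_actions, new_initial_state).
    """
    def neg(fact):
        return "not-" + fact

    # Facts whose negation is actually needed by some precondition.
    needed = set()
    for a in actions.values():
        needed |= set(a["neg_pre"])

    new_actions = {}
    for name, a in actions.items():
        new_actions[name] = {
            "pre": set(a["pre"]) | {neg(p) for p in a["neg_pre"]},
            # Deleting p makes not-p true; adding p makes not-p false.
            "add": set(a["add"]) | {neg(p) for p in a["del"] if p in needed},
            "del": set(a["del"]) | {neg(p) for p in a["add"] if p in needed},
        }
    new_initial = set(initial_state) | {neg(p) for p in needed if p not in initial_state}
    return new_actions, new_initial


actions = {
    "open": {"pre": set(), "neg_pre": {"open(d)"}, "add": {"open(d)"}, "del": set()},
    "close": {"pre": {"open(d)"}, "neg_pre": set(), "add": set(), "del": {"open(d)"}},
}
compiled, init = compile_negations(actions, initial_state=set())
print(compiled["open"], init)
```

Since every action that adds one of the two facts deletes the other, the pair not-p(x), p(x) is recognized as mutex by the precomputation described in this appendix.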

The next theorem states the correctness of our algorithms. 

Theorem ComputeMutexFacts and ComputeMutexActions correctly compute a set of persis- 
tent mutex relations between facts and actions respectively. 

Proof. Correctness of ComputeMutexActions is obvious, since it is a direct consequence 
of the definition of persistent mutex relation between actions. Correctness of ComputeMu- 
texFacts follows from the two conditions under which a potential mutex relation is made 
ComputeMutexFacts(I, O)

Input: An initial state (I) and all ground operator instances (O);
Output: A set of persistent mutex relations between facts (M).

1.  F* ← I; F ← ∅;
2.  M ← ∅; M* ← ∅; A ← ∅;
3.  while F* ≠ F ∨ M* ≠ M
4.     F ← F*; M ← M*;
5.     forall a ∈ O such that Pre(a) ⊆ F* and ¬(∃ p, q ∈ Pre(a) ∧ (p, q) ∈ M*)
6.        New(a) ← Add(a) − F*;
7.        forall f ∈ New(a)
8.           forall h ∈ Del(a)
9.              M* ← M* ∪ {(f, h), (h, f)};  /* Potential mutex relation */
10.          forall (p, q) ∈ M* such that p ∈ Pre(a) and q ∉ Del(a)
11.             M* ← M* ∪ {(f, q), (q, f)};  /* Potential mutex relation */
12.       if a ∉ A then
13.          forall p, q ∈ Add(a) such that (p, q) ∈ M*
14.             M* ← M* − {(p, q), (q, p)};  /* Invalid mutex relation */
15.       L ← Add(a) − New(a);
16.       forall (i, q) ∈ M* such that i ∈ L
17.          if q ∉ Del(a) ∧ ¬(∃ p ∈ Pre(a) ∧ (p, q) ∈ M*) then
18.             M* ← M* − {(i, q), (q, i)};  /* Invalid mutex relation */
19.       F* ← F* ∪ New(a);
20.       A ← A ∪ {a};
21. return M.


ComputeMutexActions(M, O)

Input: A set of mutex relations between facts (M) and all ground operator instances (O);
Output: A set of persistent mutex relations between actions (N).

1.  N ← ∅; O* ← O extended with the no-op of every fact;
2.  forall (p, q) ∈ M
3.     forall a ∈ O* such that p ∈ Pre(a)
4.        forall b ∈ O* such that q ∈ Pre(b)
5.           N ← N ∪ {(a, b), (b, a)};  /* Competing needs */
6.  forall a ∈ O*
7.     forall p ∈ Pre(a)
8.        forall b ∈ O such that p ∈ Del(b)
9.           N ← N ∪ {(a, b), (b, a)};  /* Interference */
10.    forall p ∈ Add(a)
11.       forall b ∈ O such that p ∈ Del(b)
12.          N ← N ∪ {(a, b), (b, a)};  /* Inconsistent effects */
13. return N.


Figure 15: LPG's algorithms for computing the mutex relations.
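The two procedures of Figure 15 can be transcribed into a compact Python sketch operating on ground facts represented as strings. It is an illustration and not LPG's C implementation: it reads step 10 with the condition q ∉ Del(a) and step 12 with the condition a ∉ A, which are assumptions of this sketch, and the `Action` tuple, the function names and the toy domain are hypothetical.

```python
from collections import namedtuple

# Ground action with sets of facts: preconditions, add and delete effects.
Action = namedtuple("Action", "name pre add dele")


def compute_mutex_facts(initial, operators):
    reached, mutex, applied = set(initial), set(), set()    # F*, M*, A
    changed = True
    while changed:
        snapshot = (set(reached), set(mutex))                # F, M
        for a in operators:
            if not a.pre <= reached:
                continue
            if any((p, q) in mutex for p in a.pre for q in a.pre):
                continue
            new = a.add - reached                            # step 6
            for f in new:                                    # steps 7-11
                for h in a.dele:
                    mutex |= {(f, h), (h, f)}
                for p, q in list(mutex):
                    if p in a.pre and q not in a.dele:
                        mutex |= {(f, q), (q, f)}
            if a.name not in applied:                        # steps 12-14
                for p in a.add:
                    for q in a.add:
                        mutex -= {(p, q), (q, p)}
            old = a.add - new                                # step 15
            for i, q in list(mutex):                         # steps 16-18
                if (i in old and q not in a.dele
                        and not any((p, q) in mutex for p in a.pre)):
                    mutex -= {(i, q), (q, i)}
            reached |= new                                   # step 19
            applied.add(a.name)                              # step 20
        changed = snapshot != (reached, mutex)
    return mutex


def compute_mutex_actions(mutex, operators, facts):
    noops = [Action("noop-" + f, {f}, {f}, set()) for f in facts]
    extended = list(operators) + noops                       # O*
    pairs = set()
    for p, q in mutex:                                       # competing needs
        for a in extended:
            if p in a.pre:
                for b in extended:
                    if q in b.pre:
                        pairs |= {(a.name, b.name), (b.name, a.name)}
    for a in extended:
        for b in operators:
            # b deletes a precondition of a (interference) or an add
            # effect of a (inconsistent effects).
            if (a.pre | a.add) & b.dele:
                pairs |= {(a.name, b.name), (b.name, a.name)}
    return pairs


# Toy domain: a robot moves between two locations and can paint at the first.
ops = [
    Action("move-a-b", {"at-a"}, {"at-b"}, {"at-a"}),
    Action("move-b-a", {"at-b"}, {"at-a"}, {"at-b"}),
    Action("paint", {"at-a"}, {"painted"}, set()),
]
fact_mutex = compute_mutex_facts({"at-a"}, ops)
action_mutex = compute_mutex_actions(fact_mutex, ops, {"at-a", "at-b", "painted"})
print(sorted(fact_mutex))
```

In the toy domain the pair (painted, at-b) is first hypothesized when paint is applied and is later invalidated by move-a-b, since painted is not deleted by that action and is not mutex with its precondition.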
invalid by the algorithm, and it can be proved by an inductive argument on the number k
of actions applied to reach a state S from the initial state. 

Induction base (k = 0). It is easy to see that each element m in the output set M is a valid 
mutex relation for the initial state (S = I), because the algorithm cannot formulate mutex 
relations involving two facts that are both true in the initial state. 

Induction hypothesis (k = n). Suppose that any element m in the output set M is a valid 
mutex relation in any state reached by the application of n actions (n > 1). 
Induction step (k = n + 1). Assume that there exists an element m in the output set M
that is not a valid mutex relation in a state S reachable by applying a sequence of n + 1
actions (because the two facts f_1 and f_2 of m are both true in S), and let a_{n+1} be the last
action in this sequence. By the inductive assumption this can happen only if (i) f_1 and
f_2 are both positive effects of a_{n+1}, or (ii) f_1 (f_2) is an add-effect of a_{n+1}, f_2 (f_1) is not
deleted by a_{n+1}, and f_2 (f_1) is true in the state S' where a_{n+1} is applied. Case (i) is ruled
out by steps 13-14 of ComputeMutexFacts. Regarding case (ii), since we are assuming
that S' is a reachable (consistent) state where f_2 (f_1) is true and a_{n+1} can be applied, there
must exist no precondition p of a_{n+1} that is mutex with f_2 (f_1). Moreover, by the inductive
assumption (p, f_2) ((p, f_1)) cannot belong to the output M-set: if some iteration of the
algorithm adds the potential mutex relation between p and f_2 (f_1) to M, then it must be
the case that it is then removed from M. It follows that, if some iteration adds (f_1, f_2) to
M, steps 16-18 will then remove it from M, contrary to our assumption that m belongs to
the output M-set.

Termination of the two algorithms is guaranteed because there is always a finite maxi- 
mum number of different facts, actions and potential mutex relations. □ 

Smith and Weld proposed the notion of "eternal mutex" (emutex) as a mutex relation 
that persists for all time (Smith & Weld, 1999). According to their definition of emutex, our
persistent mutex relations between facts and between actions subsume theirs. Concerning 
emutex relations between an action and a fact, Smith and Weld consider an action a with 
effect p emutex with a fact p, while we do not consider a and no-op(p) persistently mutex. 

Bonet and Geffner (2001) proposed a method for deriving a set of mutex relations 
between facts that has some similarities with ours. Both methods are based on hypothesizing 
a set of pairs of mutex facts that are then possibly eliminated from the set according to 
certain conditions on the preconditions and effects of the actions. However, there are also 
some significant differences. While Bonet and Geffner compute an initial large set $M_0$ of
candidate mutex pairs, and then prune it, ComputeMutexFacts incrementally constructs and
verifies the set M through a forward process. The conditions under which a pair of facts is
in $M_0$ are different from the conditions used by ComputeMutexFacts to create M (especially
the condition in step 10). Moreover, our algorithm generates and tests the pairs of M
considering only applicable actions (i.e., actions with all preconditions in F* and with
non-mutex preconditions), while Bonet and Geffner derive $M_0$ using every operator instance.
Finally, their paper does not contain algorithmic details about the identification of "bad
pairs" in $M_0$, and there is no formal proof of correctness.

For problems involving a very high number of actions, precomputing mutex relations
could be computationally very expensive. In order to cope with these cases, the user of LPG
can set an option of the planner (lowmemory) for computing the mutex relations between
actions at search time (while those between facts and between actions and no-ops are still
precomputed). With lowmemory on, preprocessing becomes faster and requires much less
memory, but each search step becomes slower. For this reason, in the current version of
LPG this option is recommended only when the precomputation of mutex relations between
actions is prohibitive. This was never the case for the test problems of the 3rd IPC designed
for the fully-automated planners, but for some of the problems designed for the hand-coded
planners, like those of the domain Satellite Hand-Coded, the use of this option
is necessary. Currently we are studying an alternative method for computing (persistent)
mutex relations during search based on the use of state invariants automatically derived
by existing domain analysis tools, such as Discoplan (Gerevini & Schubert, 1998) or Tim
(Fox & Long, 1998a). A similar method has been proposed by Fox and Long (2000).
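
The trade-off between precomputing action mutexes and deriving them at search time can be sketched as follows. This is a hedged illustration only: the toy actions, the `interfere` test (a simple interference check used as a stand-in, not the definition of persistent mutex actions used by LPG), and the class `LazyMutexes` are all assumptions introduced for the example.

```python
from itertools import combinations

# Hypothetical toy actions: name -> (preconditions, add effects, delete effects).
ACTIONS = {
    "move_a_b": (frozenset({"at_a"}), frozenset({"at_b"}), frozenset({"at_a"})),
    "move_b_a": (frozenset({"at_b"}), frozenset({"at_a"}), frozenset({"at_b"})),
    "load": (frozenset({"at_a", "empty"}), frozenset({"loaded"}), frozenset({"empty"})),
    "unload": (frozenset({"at_b", "loaded"}), frozenset({"empty"}), frozenset({"loaded"})),
}


def interfere(a, b):
    """Stand-in mutex test: one action deletes a precondition or add effect of the other."""
    pre_a, add_a, del_a = ACTIONS[a]
    pre_b, add_b, del_b = ACTIONS[b]
    return bool(del_a & (pre_b | add_b)) or bool(del_b & (pre_a | add_a))


def precompute_mutexes():
    """Eager variant: test every pair once and store the result.

    Memory grows with the number of mutex pairs, quadratic in the worst case.
    """
    return {frozenset((a, b)) for a, b in combinations(ACTIONS, 2) if interfere(a, b)}


class LazyMutexes:
    """On-demand variant: nothing is stored, each query recomputes the test."""

    def __init__(self):
        self.calls = 0  # number of mutex tests performed during search

    def is_mutex(self, a, b):
        self.calls += 1
        return interfere(a, b)


table = precompute_mutexes()
lazy = LazyMutexes()
agree = all(
    (frozenset((a, b)) in table) == lazy.is_mutex(a, b)
    for a, b in combinations(ACTIONS, 2)
)
print(agree, lazy.calls)
```

Both variants answer the same queries; the eager one pays in preprocessing time and memory, the lazy one pays a test on every query made by a search step.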

Finally, for domains involving numerical preconditions and effects, the set of mutex 
relations between actions computed by the algorithms of Figure 15 is extended using the 
definition of mutex relations for numeric domains given by Fox and Long (2003). 

Appendix B: LPG and the SuperPlanner in the Time variant of the competition domains

Appendix C: Comparison of LPG-speed and the SuperPlanner

The following table shows the performance of LPG-speed and the SuperPlanner in every vari- 
ant of every domain tested using our local search techniques. The two systems are compared 
in terms of: number of problems solved (2nd and 3rd columns); number of problems in which 
LPG-speed is faster/slower than the SuperPlanner (4th/6th columns); number of problems 
in which LPG-speed is much faster/slower than the SuperPlanner (5th/7th columns). A 
system was considered much faster than the other one when the CPU-time required by the 
first was at least one order of magnitude lower than the second. When a planner was not 
able to find a solution, the required CPU-time was considered infinite. 
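
The comparison rule stated above can be written down as a small classifier. This is a minimal sketch under stated assumptions: the function name `compare_speed`, the use of `None` for an unsolved problem, and the "tie" label for equal times (including the case in which neither planner solves the problem) are ours, since the text does not say how ties are handled.

```python
import math


def compare_speed(t_first, t_second):
    """Classify the first planner against the second by CPU time.

    A time of None means that no solution was found and is treated as infinite.
    """
    a = math.inf if t_first is None else t_first
    b = math.inf if t_second is None else t_second
    if a == b:
        return "tie"  # assumption: equal times (or both unsolved) count for neither
    faster, slower = (a, b) if a < b else (b, a)
    # "Much faster": the lower CPU time is at least one order of magnitude lower.
    much = slower >= 10 * faster
    label = "faster" if a < b else "slower"
    return "much " + label if much else label


print(compare_speed(1.0, 9.0))
print(compare_speed(1.0, 10.0))
print(compare_speed(None, 5.0))
```

In the table, a problem counted as "much better" (or "much worse") is presumably also counted in the corresponding "better" (or "worse") column, so the two columns of each pair are not disjoint.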



Domain         Problems solved Problems solved LPG better      LPG much better LPG worse       LPG much worse
               by LPG          by the          than the        than the        than the        than the
                               SuperPlanner    SuperPlanner    SuperPlanner    SuperPlanner    SuperPlanner

Strips
  Depots       22 (100%)       22 (100%)       6 (27.3%)       0 (0%)          16 (72.7%)      5 (22.7%)
  DriverLog    20 (100%)       15 (75%)        7 (35%)         5 (25%)         12 (60%)        1 (5%)
  Rovers       20 (100%)       20 (100%)       4 (20%)         0 (0%)          14 (70%)        3 (15%)
  Satellite    20 (100%)       20 (100%)       6 (30%)         1 (5%)          14 (70%)        2 (10%)
  ZenoTravel   19 (95%)        20 (100%)       0 (0%)          0 (0%)          20 (100%)       12 (60%)
  Total        99%             95.1%           22.5%           5.9%            74.5%           22.5%

Simple-time
  Depots       21 (95.5%)      11 (50%)        18 (81.8%)      14 (63.6%)      3 (13.6%)       0 (0%)
  DriverLog    18 (90%)        16 (80%)        15 (75%)        6 (30%)         3 (15%)         2 (10%)
  Rovers       20 (100%)       10 (50%)        17 (85%)        12 (60%)        1 (5%)          1 (5%)
  Satellite    20 (100%)       19 (95%)        18 (90%)        12 (60%)        1 (5%)          1 (5%)
  ZenoTravel   19 (95%)        16 (80%)        18 (90%)        9 (45%)         1 (5%)          0 (0%)
  Total        96%             70.6%           83.4%           51.9%           8.8%            3.9%

Time
  Depots       20 (90.9%)      11 (50%)        14 (63.6%)      12 (54.5%)      6 (27.3%)       2 (9.1%)
  DriverLog    18 (90%)        16 (80%)        17 (85%)        6 (30%)         0 (0%)          0 (0%)
  Rovers       20 (100%)       12 (60%)        18 (90%)        13 (65%)        2 (10%)         1 (5%)
  Satellite    20 (100%)       20 (100%)       19 (95%)        12 (60%)        1 (5%)          0 (0%)
  ZenoTravel   19 (95%)        20 (100%)       15 (75%)        0 (0%)          5 (25%)         1 (5%)
  Total        95.1%           77.5%           81.4%           42.1%           13.7%           3.9%

Numeric
  Depots       21 (95.5%)      20 (90.9%)      8 (36.4%)       2 (9.1%)        12 (54.5%)      2 (9.1%)
  DriverLog    18 (90%)        16 (80%)        7 (35%)         3 (15%)         10 (50%)        3 (15%)
  Rovers       17 (85%)        9 (45%)         10 (50%)        8 (40%)         7 (35%)         3 (15%)
  Satellite    12 (60%)        14 (70%)        2 (10%)         2 (10%)         14 (70%)        5 (25%)
  ZenoTravel   20 (100%)       20 (100%)       0 (0%)          0 (0%)          20 (100%)       5 (25%)
  Total        83.6%           77.4%           26.4%           14.7%           61.8%           17.6%

Complex
  Satellite    20 (100%)       17 (85%)        19 (95%)        14 (70%)        1 (5%)          1 (5%)

Hard-numeric
  DriverLog    20 (100%)       16 (80%)        12 (60%)        5 (25%)         8 (40%)         2 (10%)

Total          94.6%           80.3%           55.8%           30.3%           38.1%           11.6%
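
The "Total" rows appear to be percentages over the pooled problems of a variant rather than averages of the per-domain percentages. The sketch below checks this reading on the "Problems solved by LPG" column of the Simple-time block. The function name `pooled_percentage` is ours, and the domain sizes are inferred from the reported percentages (21 is 95.5% of 22; 18 is 90% of 20), not stated explicitly in the text.

```python
def pooled_percentage(counts, sizes):
    """Percentage of problems counted over all domains of a variant taken together."""
    return 100.0 * sum(counts) / sum(sizes)


# Simple-time, problems solved by LPG: Depots, DriverLog, Rovers, Satellite, ZenoTravel.
solved_by_lpg = [21, 18, 20, 20, 19]
# Number of problems per domain, inferred from the reported percentages (assumption).
domain_sizes = [22, 20, 20, 20, 20]

total = pooled_percentage(solved_by_lpg, domain_sizes)
print(f"{total:.2f}%")
```

The result, 98 solved problems out of 102, agrees with the reported total of 96% up to rounding.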

Appendix D: Comparison of LPG-quality and the SuperPlanner

The following table shows the performance of LPG-quality and the SuperPlanner in every 
variant of every domain tested using our local search techniques. The two systems are 
compared in terms of: number of problems solved (2nd and 3rd columns); number of 
problems in which the quality of the solution computed by LPG is better/worse than the 
solution computed by the SuperPlanner (4th/6th columns); number of problems in which 
the solution of LPG-quality is much better/worse than the solution of the SuperPlanner 
(5th/7th columns). A solution $\pi$ derived by a system is considered much better than the
solution $\pi'$ for the same problem derived by the other system if the quality of $\pi$ is at least
twice as good as the quality of $\pi'$, or if $\pi$ exists and $\pi'$ does not exist (because the system
could not solve the corresponding problem). The quality of a plan is measured using the
plan metric indicated in the problem specification, except for the Strips problems, where 
plan quality is defined as the number of actions. In all problems considered, the lower the 
value of the metric expression, the better the plan is. 
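
The quality comparison described above can be sketched as follows. The names `compare_quality` and `tally` are ours, `None` stands for "no plan found", and two points are assumptions rather than statements of the text: "at least twice as good" is read as "metric value at most half" (since lower values are better), and equal values are labelled "tie".

```python
def compare_quality(q_lpg, q_other):
    """Classify the LPG plan against the other plan; lower metric values are better."""
    if q_lpg is None and q_other is None:
        return "tie"  # assumption: neither system solved the problem
    if q_other is None:
        return "much better"  # LPG has a plan, the other system has none
    if q_lpg is None:
        return "much worse"
    if q_lpg == q_other:
        return "tie"
    if q_lpg < q_other:
        return "much better" if q_other >= 2 * q_lpg else "better"
    return "much worse" if q_lpg >= 2 * q_other else "worse"


def tally(pairs):
    """Count problems per table column; 'much' cases also count as better/worse."""
    labels = [compare_quality(a, b) for a, b in pairs]
    return {
        "better": sum(label.endswith("better") for label in labels),
        "much better": labels.count("much better"),
        "worse": sum(label.endswith("worse") for label in labels),
        "much worse": labels.count("much worse"),
    }


# Hypothetical metric values for four problems: (LPG, other system).
example = [(10, 20), (10, 15), (12, 10), (5, None)]
print(tally(example))
```

Counting the "much" cases inside the "better" and "worse" columns mirrors the way the table columns appear to be nested.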



Domain         Problems solved Problems solved LPG better      LPG much better LPG worse       LPG much worse
               by LPG          by the          than the        than the        than the        than the
                               SuperPlanner    SuperPlanner    SuperPlanner    SuperPlanner    SuperPlanner

Strips
  Depots       22 (100%)       22 (100%)       12 (54.5%)      0 (0%)          4 (18.2%)       0 (0%)
  DriverLog    20 (100%)       15 (75%)        14 (70%)        5 (25%)         0 (0%)          0 (0%)
  Rovers       20 (100%)       20 (100%)       9 (45%)         0 (0%)          1 (5%)          0 (0%)
  Satellite    20 (100%)       20 (100%)       13 (65%)        0 (0%)          0 (0%)          0 (0%)
  ZenoTravel   19 (95%)        20 (100%)       3 (15%)         0 (0%)          9 (45%)         1 (5%)
  Total        99%             95.1%           50%             4.9%            13.7%           0.9%

Simple-time
  Depots       21 (95.5%)      11 (50%)        19 (86.4%)      11 (50%)        1 (4.5%)        0 (0%)
  DriverLog    18 (90%)        16 (80%)        17 (85%)        3 (15%)         1 (5%)          0 (0%)
  Rovers       20 (100%)       10 (50%)        20 (100%)       10 (50%)        0 (0%)          0 (0%)
  Satellite    20 (100%)       19 (95%)        20 (100%)       7 (35%)         0 (0%)          0 (0%)
  ZenoTravel   19 (95%)        16 (80%)        17 (85%)        3 (15%)         2 (10%)         0 (0%)
  Total        96%             70.6%           91.2%           33.3%           3.9%            0%

Time
  Depots       20 (90.9%)      11 (50%)        17 (77.3%)      9 (40.9%)       2 (9.1%)        0 (0%)
  DriverLog    18 (90%)        16 (80%)        17 (85%)        4 (20%)         1 (5%)          0 (0%)
  Rovers       20 (100%)       12 (60%)        18 (90%)        8 (40%)         2 (10%)         0 (0%)
  Satellite    20 (100%)       20 (100%)       20 (100%)       5 (25%)         0 (0%)          0 (0%)
  ZenoTravel   19 (95%)        20 (100%)       11 (55%)        0 (0%)          9 (45%)         3 (15%)
  Total        95.1%           77.4%           81.4%           25.5%           13.7%           2.9%

Numeric
  Depots       21 (95.5%)      20 (90.9%)      10 (45.5%)      1 (4.5%)        8 (36.4%)       1 (4.5%)
  DriverLog    18 (90%)        16 (80%)        15 (75%)        2 (10%)         0 (0%)          0 (0%)
  Rovers       17 (85%)        9 (45%)         8 (40%)         8 (40%)         0 (0%)          0 (0%)
  Satellite    12 (60%)        14 (70%)        12 (60%)        7 (35%)         4 (20%)         4 (20%)
  ZenoTravel   20 (100%)       20 (100%)       9 (45%)         1 (5%)          6 (30%)         3 (15%)
  Total        86.3%           77.4%           52.9%           18.6%           17.6%           7.8%

Complex
  Satellite    20 (100%)       17 (85%)        19 (95%)        9 (45%)         1 (5%)          0 (0%)

Hard-numeric
  DriverLog    20 (100%)       16 (80%)        18 (90%)        5 (25%)         1 (5%)          0 (0%)

Total          94.6%           80.3%           71%             21.9%           11.6%           2.7%